Create a comparison of imputation strategies #86

Merged


anilkumarpanda (Contributor) commented on Mar 4, 2021

Initial draft of the imputation strategy comparison. Resolves #73.
Adds the following feature to Probatus:

  • Given a set of imputation strategies, show which one performs the best for the given classifier.
  • For categorical variables, the missing values are filled with 'missing' and a missing indicator is added.
  • For numerical values, various imputation strategies are applied.
  • The output is provided as a bar plot and as a report dataframe with the results, e.g.
    [image]
  • For models that handle missing values natively, the Model Imputation result is also displayed. For models that do not, only the strategy results are displayed.
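
The comparison described in the list above can be sketched roughly as follows. This is a minimal illustration built only on scikit-learn and pandas, not the Probatus implementation added in this PR: the helper names `make_toy_data` and `compare_imputers`, the `STRATEGIES` dictionary, and the toy dataset are all hypothetical. As a simplification, categorical gaps are filled with 'missing' and one-hot encoded, so the resulting 'missing' column also plays the role of the missing indicator.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def make_toy_data(n_rows=300, seed=0):
    """Build a hypothetical dataset with missing numeric and categorical values."""
    rng = np.random.default_rng(seed)
    X = pd.DataFrame({
        "num_a": rng.normal(size=n_rows),
        "num_b": rng.normal(size=n_rows),
        "cat": rng.choice(["x", "y", "z"], size=n_rows).astype(object),
    })
    # The target depends on the complete numeric values.
    y = (X["num_a"] + 0.5 * X["num_b"] > 0).astype(int)
    # Knock out roughly 20% of the values in every column.
    for col in X.columns:
        X.loc[rng.random(n_rows) < 0.2, col] = np.nan
    return X, y


# Candidate strategies for the numeric columns (illustrative choice).
STRATEGIES = {
    "mean": SimpleImputer(strategy="mean"),
    "median": SimpleImputer(strategy="median"),
    "constant_zero": SimpleImputer(strategy="constant", fill_value=0),
}


def compare_imputers(X, y, strategies, clf, cv=5, scoring="roc_auc"):
    """Cross-validate clf once per numeric imputation strategy and return a report."""
    num_cols = X.select_dtypes(include="number").columns.tolist()
    cat_cols = [c for c in X.columns if c not in num_cols]
    rows = []
    for name, imputer in strategies.items():
        # Categorical gaps become their own 'missing' level before encoding.
        cat_pipe = Pipeline([
            ("fill", SimpleImputer(strategy="constant", fill_value="missing")),
            ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ])
        pre = ColumnTransformer([
            ("num", imputer, num_cols),
            ("cat", cat_pipe, cat_cols),
        ])
        model = Pipeline([("pre", pre), ("clf", clf)])
        scores = cross_val_score(model, X, y, cv=cv, scoring=scoring)
        rows.append({
            "strategy": name,
            "mean_score": scores.mean(),
            "std_score": scores.std(),
        })
    report = pd.DataFrame(rows).sort_values("mean_score", ascending=False)
    return report.reset_index(drop=True)


X, y = make_toy_data()
report = compare_imputers(X, y, STRATEGIES, LogisticRegression(max_iter=1000))
print(report)
```

The returned dataframe has one row per strategy, sorted by mean cross-validated score, and could be drawn as a bar plot with pandas plotting. For a classifier that accepts NaN natively, such as scikit-learn's `HistGradientBoostingClassifier`, a no-imputation baseline could be added by passing `"passthrough"` as the numeric transformer.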

anilkumarpanda (Contributor, Author) commented:

@Matgrb, could you please review this?

Matgrb (Contributor) left a comment:

Great progress! I left some comments; most improvements can be made on the code side and in the documentation.

Resolved review threads:

  • docs/tutorials/nb_imputation_comparison.ipynb (5 threads, 2 outdated)
  • probatus/missing/imputation.py (3 threads, all outdated)
  • tests/missing/test_imputation.py (2 threads, both outdated)

Matgrb (Contributor) left a comment:

Looks fantastic! I left a couple of minor comments; as soon as they are closed, we can merge and make a release!

Resolved review threads:

  • probatus/missing_values/imputation.py (8 threads, 5 outdated)
  • docs/tutorials/nb_imputation_comparison.ipynb (2 threads)

Matgrb (Contributor) left a comment:

Great work! Very high quality contribution!

Matgrb merged commit 97b0661 into ing-bank:main on Mar 15, 2021
Successfully merging this pull request may close these issues.

Imputation method selection using Metric Volatility