score_genes:n_bins can be off #1126

Closed
joshua-gould opened this issue Mar 24, 2020 · 5 comments
Labels
Bug 🐛 Needs info❔ More information needed

Comments

@joshua-gould
Contributor

Ask for 25, get 26:

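import numpy as np
import pandas as pd

n_bins = 25
obs_avg = pd.Series(np.arange(100), index=np.arange(100))
n_items = int(np.round(len(obs_avg) / (n_bins - 1)))  # round(100 / 24) = 4
obs_cut = obs_avg.rank(method='min') // n_items  # ranks 1..100 map to labels 0..25
len(obs_cut.unique())  # 26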

Ask for 25, get 23:

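n_bins = 25
values = np.arange(100)
values[0:10] = 0  # ten tied values, all share rank 1
values[11:20] = 1  # nine tied values, all share rank 11
obs_avg = pd.Series(values, index=np.arange(100))
n_items = int(np.round(len(obs_avg) / (n_bins - 1)))  # 4, as before
obs_cut = obs_avg.rank(method='min') // n_items  # ties leave labels 1, 3 and 4 unused
len(obs_cut.unique())  # 23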
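
The two snippets above can be wrapped in one small helper to check both counts side by side. This is only a sketch: `count_bins` is a hypothetical name used for illustration and is not part of scanpy.

```python
import numpy as np
import pandas as pd


def count_bins(values, n_bins):
    """Count the distinct labels produced by the rank // n_items expression.

    Hypothetical helper for illustration; not a scanpy function.
    """
    obs_avg = pd.Series(values)
    n_items = int(np.round(len(obs_avg) / (n_bins - 1)))
    obs_cut = obs_avg.rank(method="min") // n_items
    return obs_cut.nunique()


# All values distinct: ranks 1..100 with n_items = 4 give labels 0..25.
distinct = np.arange(100)

# Tied values share the minimum rank, so some labels are never used.
tied = np.arange(100)
tied[0:10] = 0
tied[11:20] = 1

print(count_bins(distinct, 25))  # 26
print(count_bins(tied, 25))  # 23
```

With distinct values the ranks start at 1 rather than 0, so the highest rank (100) lands in label 25, one past the 25 requested. With ties, `method="min"` collapses each tied group onto a single rank, which leaves gaps in the labels.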
@ivirshup
Member

What are the cases where this happens?

@joshua-gould
Contributor Author

See examples above. Thanks.

@ivirshup added the Needs info❔ More information needed label Apr 17, 2020
@ivirshup
Member

Could you provide an example of what’s going wrong? There’s no scanpy code in your example.

@joshua-gould
Contributor Author

The code I included in my example and below is taken directly from the scanpy.tl.score_genes source code:

n_items = int(np.round(len(obs_avg) / (n_bins - 1)))
obs_cut = obs_avg.rank(method='min') // n_items
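
For comparison, one way to get exactly the requested number of bins is to break ties by order of appearance and scale zero-based ranks into the range `0..n_bins - 1`. This is a sketch under stated assumptions, not the change that was made in scanpy: `rank_bins` is a hypothetical name, and the approach assumes there are at least `n_bins` observations.

```python
import numpy as np
import pandas as pd


def rank_bins(obs_avg, n_bins):
    """Assign each observation a label in 0..n_bins - 1 from its rank.

    Hypothetical helper for illustration; not the scanpy implementation.
    """
    # method="first" gives unique ranks 1..n; subtract 1 to start at 0.
    ranks = obs_avg.rank(method="first") - 1
    # Scale ranks 0..n-1 onto labels 0..n_bins-1.
    return (ranks * n_bins // len(obs_avg)).astype(int)


values = np.arange(100)
values[0:10] = 0
values[11:20] = 1
obs_avg = pd.Series(values, index=np.arange(100))

obs_cut = rank_bins(obs_avg, n_bins=25)
print(obs_cut.nunique())  # 25
```

The trade-off is that tied values can end up in different bins, which the `method='min'` ranking in the quoted lines avoids.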

@flying-sheep
Member

fixed in the next version!

3 participants