
feat: Calculate latency for each batch instead of just using the average for all queries #433

Merged
merged 2 commits into erikbern:main on Oct 15, 2023

Conversation

KShivendu
Contributor

@KShivendu KShivendu commented Jul 2, 2023

We should calculate latency for each batch instead of just using the overall average, because it is more precise. At the moment, all "times" values are identical to the average, which defeats the purpose of the p99 and p95 values.
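
For illustration only (not code from the PR): a minimal numpy sketch of why identical "times" entries make the percentiles meaningless, since any percentile of a constant array is just that constant.

```python
import numpy as np

# Hypothetical example: 10,000 queries with varying individual latencies,
# versus the same queries all recorded as one global average.
true_times = np.random.exponential(scale=0.5, size=10_000)
averaged_times = np.full_like(true_times, true_times.mean())

print(np.percentile(true_times, [95, 99]))      # tail values differ from the mean
print(np.percentile(averaged_times, [95, 99]))  # both collapse to the mean
```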

Plus, I noticed a bug while running with --batch --local. All the algorithms have been refactored to have a module.py, but the runner wasn't updated accordingly, which makes it fail when running locally with batches. To keep the change minimal, I just modified definitions.py instead of all the config files.

I just did this for Qdrant right now but can do it for others as well once this PR is approved and merged :)

Also, thanks for creating/maintaining this repo. Really great work!

@erikbern
Owner

erikbern commented Jul 2, 2023

Sorry, what's the difference between batch latencies and non-batch latencies?

@KShivendu
Contributor Author

KShivendu commented Jul 2, 2023

So when running without --batch, the code measures the time spent on each query individually, which is correct. Source

But when running in batches (i.e. with --batch), the code takes the total time spent on running all the queries and divides it by the number of queries (say 10K), which makes all the (10K) entries in the "times" dataset exactly the same. Source

[screenshot omitted: angular]

This PR will make the entries in the "times" dataset different and more precise, because it will calculate the time taken for each batch and then divide it by the number of queries in that batch. It's still not perfect, but it's really the best we can do. And as I mentioned previously, p99 and p95 will now make sense because the "times" values will actually differ.
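
Here is a rough sketch of what the per-batch timing looks like (the function and parameter names are illustrative, not the actual ann-benchmarks runner code):

```python
import time

def run_in_batches(queries, batch_size, search_batch):
    """Illustrative per-batch timing: record, for each query in a batch,
    that batch's elapsed time divided by the batch size."""
    times = []
    for start in range(0, len(queries), batch_size):
        batch = queries[start:start + batch_size]
        t0 = time.perf_counter()
        search_batch(batch)  # run the whole batch against the engine
        elapsed = time.perf_counter() - t0
        # Old behaviour: one global total / len(queries) written into every entry.
        # New behaviour: the per-batch average, so entries actually differ.
        times.extend([elapsed / len(batch)] * len(batch))
    return times
```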

Does this answer your question?

@KShivendu KShivendu changed the title feat: Calculate latency for each batch instead of just using the average feat: Calculate latency for each batch instead of just using the average for all queries Jul 2, 2023
@erikbern
Owner

erikbern commented Jul 3, 2023

Not entirely following. I guess there's no obvious definition of what p90 means in batched mode. What are you suggesting is the right way to define it?

@KShivendu
Contributor Author

What are you suggesting is the right way to define it?

Not a perfect way, but yeah, I think it's the best we can do when using batches, because we don't know about individual queries in batch mode. The smaller the batch size, the better the approximation, obviously.

Tried a small experiment on replit.com to test this out: https://replit.com/@KShivendu/SecondaryLegalRelationalmodel#main.py

Here are the results:

  1. On running each query individually (true values):
  • p90: 0.8946032825299363
  • p99: 0.9905511176199423
  2. Current approach of using the average of all queries:
  • p90: 0.4918370372081173
  • p99: 0.4918370372081173 (same as p90 because all entries are the same)
  3. Proposed approach of using the average for each batch:
  • p90: 0.6048797184334235
  • p99: 0.7017439701776231
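
For reference, a minimal self-contained version of that experiment (the latencies are simulated and the batch size of 100 is an assumption, so the numbers will differ from the replit run):

```python
import numpy as np

rng = np.random.default_rng(0)
n_queries, batch_size = 10_000, 100
latencies = rng.random(n_queries)  # simulated per-query latencies

# Current approach: every entry is the global average.
global_avg = np.full(n_queries, latencies.mean())
# Proposed approach: every entry is its own batch's average.
batch_avg = np.repeat(latencies.reshape(-1, batch_size).mean(axis=1), batch_size)

for name, arr in [("true", latencies), ("global avg", global_avg), ("batch avg", batch_avg)]:
    p90, p99 = np.percentile(arr, [90, 99])
    print(f"{name:>10}  p90={p90:.4f}  p99={p99:.4f}")
```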

@KShivendu
Contributor Author

KShivendu commented Jul 5, 2023

@erikbern any thoughts on this? ^

@KShivendu
Contributor Author

@erikbern pinging in case it wasn't noticed :) ^

@maumueller
Collaborator

I didn't have batch sizes in mind when I worked on the batch mode; your proposed change makes sense to me, @KShivendu. (It also seems to be a very minimal change.)

@erikbern erikbern merged commit 271c9fb into erikbern:main Oct 15, 2023