Run ListMetrics calls with goroutines. #53

Draft
wants to merge 13 commits into
base: live
Run ListMetrics calls with goroutines.
Execute ListMetrics calls in separate goroutines (one for each metric),
in a similar way to how GetMetricData requests are handled.

This change removes the semaphore (GetMetricData does not use one
either) to make the code easier to reason about. It can easily be
added back if we hit any issues, e.g. with rate limits.

In local tests, the speedup is most noticeable when requesting 4 or
more metrics in parallel (especially with EC2/EBS).
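
If rate limits do become a problem, the semaphore could be reintroduced inside the goroutines rather than around the loop. The following is a minimal, self-contained sketch of that idea, not code from this PR: `fetchAll`, `maxInFlight` and the `fetch` callback are hypothetical names, with `fetch` standing in for a ListMetrics call.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll runs fetch once per name in its own goroutine, but lets at most
// maxInFlight calls run at the same time. fetch is a hypothetical stand-in
// for a ListMetrics request. maxInFlight must be at least 1; with 0 the
// channel is unbuffered and every goroutine would block forever.
func fetchAll(names []string, maxInFlight int, fetch func(string) []string) []string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []string
	)
	// A buffered channel used as a counting semaphore.
	sem := make(chan struct{}, maxInFlight)

	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()

			sem <- struct{}{}        // acquire a slot (blocks when all are taken)
			defer func() { <-sem }() // release the slot when done

			out := fetch(n)

			// The shared slice is only touched while holding the mutex.
			mu.Lock()
			results = append(results, out...)
			mu.Unlock()
		}(name)
	}

	wg.Wait()
	return results
}

func main() {
	names := []string{"CPUUtilization", "NetworkIn", "NetworkOut", "DiskReadOps"}
	got := fetchAll(names, 2, func(n string) []string {
		return []string{n + "/dimension"}
	})
	// The order of results depends on goroutine scheduling.
	fmt.Println(len(got), "results")
}
```

Acquiring the slot inside the goroutine keeps the loop itself non-blocking, so the limit only applies to the calls in flight.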
cristiangreco committed May 30, 2022
commit 7fa145567924e8661351d6c80905f2afbfe723a1
39 changes: 25 additions & 14 deletions pkg/abstract.go
@@ -183,26 +183,37 @@ func getMetricDataForQueries(
logger log.Logger) []cloudwatchData {
var getMetricDatas []cloudwatchData

mux := &sync.Mutex{}
var wg sync.WaitGroup

// For every metric of the job
for _, metric := range discoveryJob.Metrics {
// Get the full list of metrics
// This includes, for this metric the possible combinations
// of dimensions and value of dimensions with data
tagSemaphore <- struct{}{}
wg.Add(1)
go func(m *Metric) {
defer wg.Done()
// Get the full list of metrics
// This includes, for this metric the possible combinations
// of dimensions and value of dimensions with data
metricsList, err := getFullMetricsList(ctx, svc.Namespace, m, clientCloudwatch)

if err != nil {
level.Error(logger).Log("msg", "Failed to get full metric list", "err", err, "metric_name", m.Name, "namespace", svc.Namespace)

Maybe we could add a Prometheus metric to track both these errors and the zero-resources scenario below?

return
}

metricsList, err := getFullMetricsList(ctx, svc.Namespace, metric, clientCloudwatch)
<-tagSemaphore
if len(resources) == 0 {
level.Debug(logger).Log("msg", "No resources for metric", "err", err, "metric_name", m.Name, "namespace", svc.Namespace)
}

if err != nil {
level.Error(logger).Log("msg", "Failed to get full metric list", "err", err, "metric_name", metric.Name, "namespace", svc.Namespace)
continue
}
metricDatas := getFilteredMetricDatas(region, accountId, discoveryJob.Type, discoveryJob.CustomTags, tagsOnMetrics, svc.DimensionRegexps, resources, metricsList.Metrics, m)

if len(resources) == 0 {
level.Debug(logger).Log("msg", "No resources for metric", "err", err, "metric_name", metric.Name, "namespace", svc.Namespace)
}
getMetricDatas = append(getMetricDatas, getFilteredMetricDatas(region, accountId, discoveryJob.Type, discoveryJob.CustomTags, tagsOnMetrics, svc.DimensionRegexps, resources, metricsList.Metrics, metric)...)
mux.Lock()
getMetricDatas = append(getMetricDatas, metricDatas...)
mux.Unlock()
}(metric)
}

wg.Wait()
return getMetricDatas
}
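
For reference, the pattern this diff introduces (one goroutine per metric, a `sync.WaitGroup` to wait for all of them, a mutex around the shared slice) can be reduced to the sketch below. It also counts failed calls, in the spirit of the review comment above. Everything here is an assumption for illustration: `collector`, `collect`, `Errors` and `errList` are hypothetical names, and the plain atomic counter only stands in for whatever Prometheus metric the project might add.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// errList is a hypothetical error returned by the fake list function below.
var errList = errors.New("list failed")

// collector counts failed list calls. The counter is a stand-in for a real
// metric and is only updated through sync/atomic.
type collector struct {
	listErrors int64
}

// Errors returns the number of failed list calls seen so far.
func (c *collector) Errors() int64 {
	return atomic.LoadInt64(&c.listErrors)
}

// collect calls list once per name, each in its own goroutine, and merges
// the successful results. A failed call is counted and skipped, mirroring
// the early return in the goroutine of the diff.
func (c *collector) collect(names []string, list func(string) ([]string, error)) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		all []string
	)

	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()

			items, err := list(n)
			if err != nil {
				atomic.AddInt64(&c.listErrors, 1)
				return
			}

			// Appending to a shared slice is not safe without the lock.
			mu.Lock()
			all = append(all, items...)
			mu.Unlock()
		}(name)
	}

	wg.Wait()
	return all
}

func main() {
	c := &collector{}
	got := c.collect([]string{"ok-1", "bad", "ok-2"}, func(n string) ([]string, error) {
		if n == "bad" {
			return nil, errList
		}
		return []string{n}, nil
	})
	fmt.Println(len(got), "results,", c.Errors(), "errors")
}
```

The mutex is held only for the append, so the slow part (the list call) still runs fully in parallel.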
