[QUESTION] Clarification Needed on "global" Label in Metrics from CloudWatch Exporter #1383

Closed
1 task done
psujit775 opened this issue Apr 14, 2024 · 2 comments
Labels
bug Something isn't working

psujit775 commented Apr 14, 2024

Is there an existing issue for this?

  • I have searched the existing issues

YACE version

v0.34.0-alpha

Config file

apiVersion: v1alpha1
discovery:
  jobs:
  - type: nlb
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    awsDimensions:
    - LoadBalancer
    delay: 180
    addCloudwatchTimestamp: true
    metrics:
    - name: ActiveFlowCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ActiveFlowCount_TCP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ActiveFlowCount_TLS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ActiveFlowCount_UDP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ClientTLSNegotiationErrorCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ConsumedLCUs
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ConsumedLCUs_TCP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ConsumedLCUs_TLS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ConsumedLCUs_UDP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: NewFlowCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: NewFlowCount_TCP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: NewFlowCount_TLS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: NewFlowCount_UDP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ProcessedBytes
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ProcessedBytes_TCP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ProcessedBytes_TLS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: ProcessedBytes_UDP
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: TargetTLSNegotiationErrorCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: TCP_Client_Reset_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: TCP_ELB_Reset_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: TCP_Target_Reset_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
  - type: nlb
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    awsDimensions:
    - LoadBalancer
    - TargetGroup
    delay: 180
    addCloudwatchTimestamp: true
    metrics:
    - name: HealthyHostCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180
      nilToZero: true
    - name: UnHealthyHostCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 180  
      nilToZero: true
  - type: elb
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    awsDimensions:
    - LoadBalancerName
    delay: 120
    addCloudwatchTimestamp: true
    metrics:
    - name: BackendConnectionErrors
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HealthyHostCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Backend_2XX
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Backend_3XX
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Backend_4XX
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Backend_5XX
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_4XX
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_5XX
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: Latency
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: RequestCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: SpilloverCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: SurgeQueueLength
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: UnHealthyHostCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
  - type: rds
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    awsDimensions:
    - DBInstanceIdentifier
    delay: 120
    addCloudwatchTimestamp: true
    metrics:
    - name: BinLogDiskUsage
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: BurstBalance
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: CPUCreditBalance
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: CPUCreditUsage
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: CPUSurplusCreditBalance
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: CPUSurplusCreditsCharged
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: CPUUtilization
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: DatabaseConnections
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: DBLoad
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: DBLoadCPU
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: DBLoadNonCPU
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: DiskQueueDepth
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: FreeableMemory
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: FreeStorageSpace
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: LVMReadIOPS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: LVMWriteIOPS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: NetworkReceiveThroughput
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: NetworkTransmitThroughput
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ReadIOPS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ReadLatency
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ReadThroughput
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ReplicaLag
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: SwapUsage
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: WriteIOPS
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: WriteLatency
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: WriteThroughput
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
  - type: alb
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    awsDimensions:
    - LoadBalancer
    delay: 120
    addCloudwatchTimestamp: true
    metrics:
    - name: ActiveConnectionCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ClientTLSNegotiationErrorCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_3XX_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_4XX_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_500_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_502_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_503_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_504_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_ELB_5XX_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Target_2XX_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Target_3XX_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Target_4XX_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HTTPCode_Target_5XX_Count
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: NewConnectionCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: RejectedConnectionCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: RequestCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: RuleEvaluations
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: TargetConnectionErrorCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: TargetResponseTime
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: HealthyHostCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: RequestCountPerTarget
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: TargetTLSNegotiationErrorCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: UnHealthyHostCount
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
  - type: sqs
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    awsDimensions:
    - QueueName
    delay: 120
    addCloudwatchTimestamp: true
    metrics:
    - name: ApproximateAgeOfOldestMessage
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ApproximateNumberOfMessagesDelayed
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ApproximateNumberOfMessagesNotVisible
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: ApproximateNumberOfMessagesVisible
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: NumberOfEmptyReceives
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: NumberOfMessagesDeleted
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: NumberOfMessagesReceived
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: NumberOfMessagesSent
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: SentMessageSize
      statistics: [Average, Minimum, Maximum, Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
  # https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/programming-cloudwatch-metrics.html#cloudfront-metrics-distribution-values
  - type: cloudfront
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    awsDimensions:
    - DistributionId
    delay: 120
    addCloudwatchTimestamp: true
    metrics:
    - name: 4xxErrorRate
      statistics: [Average]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: 5xxErrorRate
      statistics: [Average]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: BytesDownloaded
      statistics: [Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: BytesUploaded
      statistics: [Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: TotalErrorRate
      statistics: [Average]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
    - name: Requests
      statistics: [Sum]
      period: 60
      length: 60
      delay: 120
      nilToZero: true

Current Behavior

Some metrics are exposed with the label name="global". This label value appears on series that lack a specific instance identifier (dimension_DBInstanceIdentifier is empty). Examples of such metrics:

aws_rds_read_latency_average{account_id="12345678990",dimension_DBInstanceIdentifier="",dimension_DatabaseClass="",dimension_EngineName="",name="global",region="us-east-1"} 0.0153301782795497 1713084660000
aws_rds_read_latency_average{account_id="12345678990",dimension_DBInstanceIdentifier="",dimension_DatabaseClass="",dimension_EngineName="mysql",name="global",region="us-east-1"} 0.0153301782795497 1713084660000
aws_rds_read_latency_average{account_id="12345678990",dimension_DBInstanceIdentifier="",dimension_DatabaseClass="db.r5.2xlarge",dimension_EngineName="",name="global",region="us-east-1"} 0.0007798980581246909 1713084660000
aws_rds_read_latency_average{account_id="12345678990",dimension_DBInstanceIdentifier="",dimension_DatabaseClass="db.r5.4xlarge",dimension_EngineName="",name="global",region="us-east-1"} 0.03986423938264264 1713084660000
aws_rds_read_latency_average{account_id="12345678990",dimension_DBInstanceIdentifier="",dimension_DatabaseClass="db.r5.xlarge",dimension_EngineName="",name="global",region="us-east-1"} 0.0007090139140955837 1713084660000
aws_rds_read_latency_average{account_id="12345678990",dimension_DBInstanceIdentifier="",dimension_DatabaseClass="db.r6g.xlarge",dimension_EngineName="",name="global",region="us-east-1"} 0 1713084660000

I'm also getting metrics with a proper dimension_DBInstanceIdentifier for each RDS instance. So what are these extra metrics with name="global"?

Expected Behavior

No response

Steps To Reproduce

Running in an EKS cluster with the command below:

/bin/yace --config.file=/etc/cloudwatch-exporter/cloudwatch-exporter-config.yaml

Anything else?

The issue above affects all the RDS-related metrics listed in the config file.

psujit775 added the "bug" label Apr 14, 2024
Contributor

kgeckhart commented Apr 17, 2024

These are metrics whose dimensions cannot be linked back to a specific resource; see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/dimensions.html. This is working as intended, since the exporter is responsible for exporting the data exactly as it exists in CloudWatch.

You could use a Prometheus relabel rule to drop these when scraping YACE. I believe that if you add dimensionNameRequirements set to DBInstanceIdentifier, it will drop all the variants that produce global. Using dimensionNameRequirements will also save you money on AWS API calls, because it is applied before GetMetricData is called.
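
As a rough sketch of the dimensionNameRequirements suggestion, the RDS job from the config above might be narrowed as shown here. The job-level placement of the field and the trimmed metric list are assumptions for illustration, so check the YACE configuration documentation for the version in use.

```yaml
apiVersion: v1alpha1
discovery:
  jobs:
  - type: rds
    regions:
    - us-east-1
    searchTags:
    - key: some-tag
      value: true
    # Assumption: keep only metrics whose dimension set is exactly
    # [DBInstanceIdentifier]; the aggregate variants (DatabaseClass,
    # EngineName, no dimensions) should then never be queried.
    dimensionNameRequirements:
    - DBInstanceIdentifier
    metrics:
    - name: ReadLatency
      statistics: [Average]
      period: 60
      length: 60
      delay: 120
      nilToZero: true
```

The relabel alternative could look roughly like the Prometheus scrape config below. The job name and target address are hypothetical placeholders.

```yaml
scrape_configs:
- job_name: yace              # hypothetical job name
  static_configs:
  - targets: ["yace:5000"]    # hypothetical address; adjust to your deployment
  metric_relabel_configs:
  # Drop every scraped series whose `name` label is exactly "global".
  - source_labels: [name]
    regex: global
    action: drop
```

As written, this rule drops name="global" series from every service YACE exports, not only RDS. To limit it to RDS, one option is to match on the metric name as well, for example source_labels: [__name__, name] with regex: aws_rds_.*;global. Unlike dimensionNameRequirements, relabeling happens after YACE has already fetched the data, so it does not reduce AWS API calls.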

@psujit775
Author

Makes sense. Thanks, @kgeckhart.
