YACE reports "Service is not in known list!: containerInsights", Pod run into error #1480

huiweiguozi · 2024-07-29T01:03:25Z

Is there an existing issue for this?

I have searched the existing issues

YACE version

No response

Config file

values.yaml:

# Default values for yet-another-cloudwatch-exporter.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

replicaCount: 1

image:
  registry: ghcr.io
  repository: nerdswords/yet-another-cloudwatch-exporter
  pullPolicy: Always #IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: ""

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

serviceAccount:
  # -- Specifies whether a service account should be created
  create: true
  # -- Labels to add to the service account
  labels: {}
  # -- Annotations to add to the service account
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<account_id>:role/cwexporter-same_account-role #grant CloudwatchFullAccess to this role
    eks.amazonaws.com/sts-regional-endpoints: "true"
  # -- The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: eks-service-account

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/path: "/metrics"
#  prometheus.io/port: "9106"

podLabels: {}

portName: http

podSecurityContext: {}
  # fsGroup: 2000

securityContext: {}
  # capabilities:
  #   drop:
  #   - ALL
  # readOnlyRootFilesystem: true
  # runAsNonRoot: true
  # runAsUser: 1000

service:
  type: ClusterIP
  port: 9106
  # -- Annotations to add to the service
  annotations: {}

testConnection: true

ingress:
  enabled: false
  className: ""
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: chart-example.local
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local

resources: {}
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #   cpu: 100m
  #   memory: 128Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi

nodeSelector: {}
priorityClassName:

tolerations: []

affinity: {}

extraEnv: []
  # Define extra environmental variables list as follows
  # - name : key1
  #   value: value1

extraEnvFrom: []
  # Define extra environmental variables from secrets or configmaps
  # - secretRef:
  #     name: secrets

extraArgs: {}
  # scraping-interval: 300

extraVolumeMounts: []
  # Additional volumeMounts to the container.
  # - name: secrets-store01-inline
  #   mountPath: /mnt/secrets-store

extraVolumes: []
# Additional volumes to the pod.
# - csi:
#     driver: secrets-store.csi.k8s.io
#     readOnly: true
#     volumeAttributes:
#       secretProviderClass: "secret-csi-provider"
#   name : secrets-store01-inline

aws:
  role:
  # The name of a pre-created secret in which AWS credentials are stored. When
  # set, aws_access_key_id is assumed to be in a field called access_key,
  # aws_secret_access_key is assumed to be in a field called secret_key, and the
  # session token, if it exists, is assumed to be in a field called
  # security_token
  secret:
    name:
    includesSessionToken: false

  # Note: Do not specify the aws_access_key_id and aws_secret_access_key if you specified role or secret.name before
  aws_access_key_id:
  aws_secret_access_key:

serviceMonitor:
  # When set true then use a ServiceMonitor to configure scraping
  enabled: false
  # Set the namespace the ServiceMonitor should be deployed
  # namespace: monitoring
  # Set how frequently Prometheus should scrape
  # interval: 30s
  # Set path to cloudwatch-exporter telemetry-path
  # telemetryPath: /metrics
  # Set labels for the ServiceMonitor, use this to define your scrape label for Prometheus Operator
  # labels:
  # Set timeout for scrape
  # timeout: 10s
  # Set relabelings for the ServiceMonitor, use to apply to samples before scraping
  # relabelings: []
  # Set metricRelabelings for the ServiceMonitor, use to apply to samples for ingestion
  # metricRelabelings: []
  #
  # Example - note the Kubernetes convention of camelCase instead of Prometheus' snake_case
  # metricRelabelings:
  #   - sourceLabels: [dbinstance_identifier]
  #     action: replace
  #     replacement: mydbname
  #     targetLabel: dbname

prometheusRule:
  # Specifies whether a PrometheusRule should be created
  enabled: false
  # Set the namespace the PrometheusRule should be deployed
  # namespace: monitoring
  # Set labels for the PrometheusRule, use this to define your scrape label for Prometheus Operator
  # labels:
  # Example - note the Kubernetes convention of camelCase instead of Prometheus'
  # rules:
  #    - alert: ELB-Low-BurstBalance
  #      annotations:
  #        message: The ELB BurstBalance during the last 10 minutes is lower than 80%.
  #      expr: aws_ebs_burst_balance_average < 80
  #      for: 10m
  #      labels:
  #        severity: warning
  #    - alert: ELB-Low-BurstBalance
  #      annotations:
  #        message: The ELB BurstBalance during the last 10 minutes is lower than 50%.
  #      expr: aws_ebs_burst_balance_average < 50
  #      for: 10m
  #      labels:
  #        severity: warning
  #    - alert: ELB-Low-BurstBalance
  #      annotations:
  #        message: The ELB BurstBalance during the last 10 minutes is lower than 30%.
  #      expr: aws_ebs_burst_balance_average < 30
  #      for: 10m
  #      labels:
  #        severity: critical

    
#config: |-
#  apiVersion: v1alpha1
#  sts-region: us-east-1
#  discovery:
#    exportedTagsOnMetrics:
#      ec2:
#        - Name
#    jobs:
#    - type: ec2
#      regions:
#        - us-west-2
#      period: 60
#      length: 600
#      metrics:
#        - name: CPUUtilization
#          statistics: [Average]
#        - name: NetworkIn
#          statistics: [Average, Sum]
#        - name: NetworkOut
#          statistics: [Average, Sum]
#        - name: NetworkPacketsIn
#          statistics: [Sum]
#        - name: NetworkPacketsOut
#          statistics: [Sum]
#        - name: DiskReadBytes
#          statistics: [Sum]
#        - name: DiskWriteBytes
#          statistics: [Sum]
#        - name: DiskReadOps
#          statistics: [Sum]
#        - name: DiskWriteOps
#          statistics: [Sum]
#        - name: StatusCheckFailed
#          statistics: [Sum]
#        - name: StatusCheckFailed_Instance
#          statistics: [Sum]
#        - name: StatusCheckFailed_System
#          statistics: [Sum]
#        - name: StatusCheckFailed_AttachedEBS
#          statistics: [Sum]
config: |-
  apiVersion: v1alpha1
  discovery:
    jobs:
      - type: ContainerInsights
        regions:
          - us-west-2
        period: 300
        length: 300
        metrics:
          - name: apiserver_storage_objects
            statistics: [Sum]

Current Behavior

helm install mychart nerdswords/yet-another-cloudwatch-exporter --values ./values.yaml

This values.yaml works for other services. After I changed the config snippet below to use ContainerInsights, the pod could not start when I ran the command above:

config: |-
  apiVersion: v1alpha1
  discovery:
    jobs:
      - type: ContainerInsights
        regions:
          - us-west-2
        period: 300
        length: 300
        metrics:
          - name: apiserver_storage_objects
            statistics: [Sum]

The pod logs show:
kubectl logs mychart-yet-another-cloudwatch-exporter-5c55b49fc4-rb66c
{"caller":"main.go:240","level":"info","msg":"Parsing config","ts":"2024-07-29T00:40:23.769086031Z","version":"v0.57.1"}
{"caller":"main.go:67","err":"Couldn't read /config/config.yml: Discovery job [0]: Service is not in known list!: containerInsights","level":"error","msg":"Error running yace","ts":"2024-07-29T00:40:23.769601042Z","version":"v0.57.1"}

Question: What's the correct configuration to monitor EKS via ContainerInsights?

Expected Behavior

Question: What's the correct configuration to monitor EKS via ContainerInsights?

Steps To Reproduce

As above in "Current Behavior"

Anything else?

No response

huiweiguozi · 2024-07-29T07:12:27Z

YACE version: v0.57.1

vainiusd · 2024-08-05T06:49:15Z

The value for the Container Insights namespace should be ECS/ContainerInsights or its alias ecs-containerinsights.
For your stated version:
https://github.com/nerdswords/yet-another-cloudwatch-exporter/blob/v0.57.1/pkg/config/services.go#L350
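
For illustration, a minimal sketch of the `config` block from the values.yaml above with only the job type swapped to the suggested name. The metric name `CpuUtilized` and the `Average` statistic are assumptions chosen for the ECS namespace (the original `apiserver_storage_objects` is listed in the EKS metrics docs, not the ECS ones), so treat this as a starting point rather than a verified configuration:

```yaml
config: |-
  apiVersion: v1alpha1
  discovery:
    jobs:
      # Type changed from ContainerInsights to the name suggested above;
      # the alias ecs-containerinsights should be equivalent.
      - type: ECS/ContainerInsights
        regions:
          - us-west-2
        period: 300
        length: 300
        metrics:
          # Assumption: an ECS Container Insights metric, used here only
          # as a placeholder; replace with the metrics you need.
          - name: CpuUtilized
            statistics: [Average]
```

This should get past the "Service is not in known list!" check at config parsing, but it targets the ECS namespace; whether it returns anything for an EKS cluster is not confirmed here.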

huiweiguozi · 2024-08-12T19:32:11Z

Hi vainiusd, thanks!

I'm trying to monitor EKS, does this work for EKS? Because according to https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-metrics-EKS.html, it seems EKS uses namespace ContainerInsights rather than ECS/ContainerInsights as in https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-metrics-ECS.html.

huiweiguozi added the bug Something isn't working label Jul 29, 2024
