Attributes Processor

Status

  • Stability: beta (traces, metrics, logs)
  • Distributions: core, contrib, k8s
  • Warnings: Identity Conflict
  • Code Owners: @boostchicken

The attributes processor modifies attributes of a span, log, or metric. Please refer to config.go for the config spec.

The processor can also filter and match input data to determine whether it should be included in or excluded from the specified actions.

It takes a list of actions, which are performed in the order specified in the config. The supported actions are:

  • insert: Inserts a new attribute in input data where the key does not already exist.
  • update: Updates an attribute in input data where the key does exist.
  • upsert: Performs insert or update. Inserts a new attribute in input data where the key does not already exist and updates an attribute in input data where the key does exist.
  • delete: Deletes an attribute from the input data.
  • hash: Hashes (SHA1) an existing attribute value.
  • extract: Extracts values using a regular expression rule from the input key to target keys specified in the rule. If a target key already exists, it will be overwritten. Note: It behaves similarly to the Span Processor to_attributes setting with the existing attribute as the source.
  • convert: Converts an existing attribute to a specified type.

For the actions insert, update and upsert,

  • key is required
  • one of value, from_attribute or from_context is required
  • action is required.
  # Key specifies the attribute to act upon.
- key: <key>
  action: {insert, update, upsert}
  # Value specifies the value to populate for the key.
  # The type is inferred from the configuration.
  value: <value>

  # Key specifies the attribute to act upon.
- key: <key>
  action: {insert, update, upsert}
  # FromAttribute specifies the attribute from the input data to use to populate
  # the value. If the attribute doesn't exist, no action is performed.
  from_attribute: <other key>

  # Key specifies the attribute to act upon.
- key: <key>
  action: {insert, update, upsert}
  # FromContext specifies the context value to use to populate the attribute value. 
  # If the key is prefixed with `metadata.`, the values are searched
  # in the receiver's transport protocol additional information like gRPC Metadata or HTTP Headers
  # (be sure to set `include_metadata: true` on the receiver).
  # If the key is prefixed with `auth.`, the values are searched
  # in the authentication information set by the server authenticator.
  # Refer to the documentation of the server authenticator used in your pipeline for more information about which attributes are available.
  # If the key is `client.address`, the value will be set to the client address.
  # If the key doesn't exist, no action is performed.
  # If the key has multiple values, they are joined with a `;` separator.
  from_context: <other key>
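
As an illustration of the `from_context` rules above, the following is a minimal sketch rather than a reference configuration. The header name `x-tenant-id`, the attribute keys `tenant.id` and `source.address`, and the processor name are hypothetical. The placement of `include_metadata` is shown for an OTLP receiver over gRPC and is an assumption for any other receiver.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        # Needed so that `metadata.` lookups can see the request metadata.
        include_metadata: true

processors:
  attributes/from_context_example:
    actions:
      # Copy the (hypothetical) x-tenant-id gRPC metadata entry into tenant.id.
      # Multiple values for the entry would be joined with `;`.
      - key: tenant.id
        from_context: metadata.x-tenant-id
        action: upsert
      # Record the client address, but only if source.address is not already set.
      - key: source.address
        from_context: client.address
        action: insert
```

If the metadata entry is missing from a request, the first action is expected to do nothing for that request, as described in the comments above.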

For the delete action,

  • key and/or pattern is required
  • action: delete is required.
# Key specifies the attribute to act upon.
- key: <key>
  action: delete
  # Rule specifies the regex pattern for attribute names to act upon.
  pattern: <regular pattern>
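
A short sketch of both ways to select attributes for deletion, assuming a hypothetical naming convention in which sensitive attributes share the `secret.` prefix:

```yaml
processors:
  attributes/delete_example:
    actions:
      # Delete a single attribute by its exact key.
      - key: account_password
        action: delete
      # Delete every attribute whose name matches the regular expression.
      # The `secret.` prefix is a hypothetical convention used for illustration.
      - pattern: ^secret\..*
        action: delete
```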

For the hash action,

  • key and/or pattern is required
  • action: hash is required.
# Key specifies the attribute to act upon.
- key: <key>
  action: hash
  # Rule specifies the regex pattern for attribute names to act upon.
  pattern: <regular pattern>

For the extract action,

  • key is required
  • pattern is required
  • action: extract is required.
# Key specifies the attribute to extract values from.
# The value of `key` is NOT altered.
- key: <key>
  # Rule specifies the regex pattern used to extract attributes from the value
  # of `key`.
  # The submatchers must be named.
  # If attributes already exist, they will be overwritten.
  pattern: <regular pattern with named matchers>
  action: extract
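
To make the named-submatcher requirement concrete, here is a hedged sketch. The processor name and the target keys `http_protocol`, `http_domain` and `http_path` are chosen for illustration, and the pattern is deliberately simple rather than a complete URL parser.

```yaml
processors:
  attributes/extract_example:
    actions:
      # Given http.url = "https://example.com/checkout?item=42", the named
      # groups below would be expected to produce:
      #   http_protocol = "https"
      #   http_domain   = "example.com"
      #   http_path     = "/checkout"
      # The value of http.url itself is left unchanged.
      - key: http.url
        pattern: '^(?P<http_protocol>[^:]+)://(?P<http_domain>[^/]+)(?P<http_path>/[^?]*)'
        action: extract
```

Each named group becomes a target attribute key, so choosing group names that do not collide with existing attributes avoids unintended overwrites.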

For the convert action,

  • key is required
  • action: convert is required.
  • converted_type is required and must be one of int, double or string
# Key specifies the attribute to act upon.
- key: <key>
  action: convert
  converted_type: <int|double|string>

The list of actions can be composed to create rich scenarios, such as backfilling attributes, copying values to a new key, and redacting sensitive information. The following is a sample configuration.

processors:
  attributes/example:
    actions:
      - key: db.table
        action: delete
      - key: redacted_span
        value: true
        action: upsert
      - key: copy_key
        from_attribute: key_original
        action: update
      - key: account_id
        value: 2245
        action: insert
      - key: account_password
        action: delete
      - key: account_email
        action: hash
      - key: http.status_code
        action: convert
        converted_type: int

Refer to config.yaml for detailed examples on using the processor.

Attributes Processor for Metrics vs. Metrics Transform Processor

Regarding metric support, these two processors have overlapping functionality. Both can make simple modifications to metric attribute key-value pairs. As a general rule, the attributes processor has more attribute-related functionality, while the metrics transform processor can do much more data manipulation. The attributes processor is preferred when only the overlapping functionality is needed, as it natively uses the official OpenTelemetry data model. However, if the metrics transform processor is already in use or its extra functionality is necessary, there's no need to migrate away from it.

Shared functionality

  • Add attributes
  • Update values of attributes

Attributes processor specific functionality

  • delete
  • hash
  • extract

Metrics transform processor specific functionality

  • Rename metrics
  • Delete data points
  • Toggle data type
  • Scale value
  • Aggregate across label sets
  • Aggregate across label values

Include/Exclude Filtering

The attributes processor can match a set of properties of a span, log, or metric record to determine whether the input data should be included in or excluded from processing. To configure this option, specify match_type under include and/or exclude, together with at least one of the following:

  • For spans, one of services, span_names, span_kinds, attributes, resources or libraries must be specified with a non-empty value for a valid configuration. The log_bodies, log_severity_texts, log_severity_number and metric_names fields are invalid.
  • For logs, one of log_bodies, log_severity_texts, log_severity_number, attributes, resources or libraries must be specified with a non-empty value for a valid configuration. The span_names, span_kinds, metric_names and services fields are invalid.
  • For metrics, metric_names must be specified with a valid non-empty value for a valid configuration. The span_names, span_kinds, resources, log_bodies, log_severity_texts, log_severity_number, services, attributes and libraries fields are invalid.

Note: If both include and exclude are specified, the include properties are checked before the exclude properties.

attributes:
    # include and/or exclude can be specified. However, the include properties
    # are always checked before the exclude properties.
    {include, exclude}:
      # At least one of services, span_names or attributes must be specified.
      # It is supported to have more than one specified, but all of the specified
      # conditions must evaluate to true for a match to occur.

      # match_type controls how items in "services", "span_names", and "attributes"
      # arrays are interpreted. Possible values are "regexp" or "strict".
      # This is a required field.
      match_type: {strict, regexp}

      # regexp is an optional configuration section for match_type regexp.
      regexp:
        # < see "Match Configuration" below >

      # services specify an array of items to match the service name against.
      # A match occurs if the span service name matches at least one of the items.
      # This is an optional field.
      services: [<item1>, ..., <itemN>]

      # resources specifies a list of resources to match against.
      # A match occurs if the input data's resources match at least one of the items.
      # This is an optional field.
      resources:
          # Key specifies the resource to match against.
        - key: <key>
          # Value specifies the exact value to match against.
          # If not specified, a match occurs if the key is present in the resources.
          value: {value}

      # libraries specify a list of items to match the implementation library against.
      # A match occurs if the input data implementation library matches at least one of the items.
      # This is an optional field.
      libraries:
          # Name specifies the library to match against.
        - name: <name>
          # Version specifies the exact version to match against.
          # This is an optional field.
          # If the field is not set, any version will match.
          # If the field is set to an empty string, only an
          # empty string version will match.
          version: {version}

      # The span name must match at least one of the items.
      # This is an optional field.
      span_names: [<item1>, ..., <itemN>]

      # The span kind must match at least one of the items.
      # This is an optional field.
      span_kinds: [<item1>, ..., <itemN>]

      # The log body must match at least one of the items.
      # Currently only string body types are supported.
      # This is an optional field.
      log_bodies: [<item1>, ..., <itemN>]

      # The log severity text must match at least one of the items.
      # This is an optional field.
      log_severity_texts: [<item1>, ..., <itemN>]

      # The log severity number defines how to match against a log record's
      # SeverityNumber, if defined.
      # This is an optional field.
      log_severity_number:
        # Min is the lowest severity that may be matched.
        # e.g. if this is plog.SeverityNumberInfo, 
        # INFO, WARN, ERROR, and FATAL logs will match.
        min: <int>
        # MatchUndefined controls whether logs with "undefined" severity match.
        # If this is true, entries with undefined severity will match.
        match_undefined: <bool>

      # The metric name must match at least one of the items.
      # This is an optional field.
      metric_names: [<item1>, ..., <itemN>]

      # Attributes specifies the list of attributes to match against.
      # All of these attributes must match for a match to occur.
      # This is an optional field.
      attributes:
          # Key specifies the attribute to match against.
        - key: <key>
          # Value specifies the value to match against.
          # If not specified, a match occurs if the key is present in the attributes.
          value: {value}
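
The filtering fields above can be combined with actions as in the following sketch for spans. The service names, attribute keys and processor name are placeholders chosen for illustration.

```yaml
processors:
  attributes/selective_example:
    # Only spans from these services are considered (checked first).
    include:
      match_type: strict
      services: ["svcA", "svcB"]
    # Of the included spans, skip those carrying redact_trace = false.
    exclude:
      match_type: strict
      attributes:
        - key: redact_trace
          value: false
    # The actions run only on spans that pass both filters.
    actions:
      - key: credit_card
        action: delete
      - key: duplicate_key
        action: delete
```

With this configuration, a span from svcA without a redact_trace attribute would be expected to have credit_card and duplicate_key removed, while a span from any other service would be left untouched.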

Match Configuration

Some match_type values have additional configuration options that can be specified. The match_type value is the name of the configuration section. These sections are optional.

# regexp is an optional configuration section for match_type regexp.
regexp:
  # cacheenabled determines whether match results are LRU cached to make subsequent matches faster.
  # Cache size is unlimited unless cachemaxnumentries is also specified.
  cacheenabled: <bool>
  # cachemaxnumentries is the max number of entries of the LRU cache; ignored if cacheenabled is false.
  cachemaxnumentries: <int>
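
As a sketch of how the regexp section sits alongside the match properties, the following enables a bounded cache. The service name pattern, the cache size of 1000, and the attribute key are arbitrary values chosen for illustration, not recommendations.

```yaml
processors:
  attributes/regexp_example:
    include:
      match_type: regexp
      # Optional tuning for regular expression matching.
      regexp:
        cacheenabled: true
        # Bound the LRU cache; without this the cache size is unlimited.
        cachemaxnumentries: 1000
      # Each item is interpreted as a regular expression.
      services: ["auth.*", "login.*"]
    actions:
      - key: password
        action: delete
```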

Warnings

In general, the Attributes processor is a very safe processor to use. Care only needs to be taken when modifying data point attributes:

  • Identity Conflict: Reducing/changing existing data point attributes has the potential to create an identity conflict since the Attributes processor does not perform any re-aggregation of the data points. Adding new attributes to data points is safe.
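
The following sketch shows how an identity conflict could arise. The metric name `http.server.requests`, the attribute keys, and the data point values in the comments are hypothetical.

```yaml
processors:
  attributes/drop_host_example:
    include:
      match_type: strict
      metric_names: ["http.server.requests"]
    actions:
      # Before this action, the metric has two distinct data points:
      #   {method: GET, host: a} -> 10
      #   {method: GET, host: b} -> 7
      # After deleting `host`, both data points carry only {method: GET}.
      # They are not merged or re-aggregated, so two data points now share
      # the same identity.
      - key: host
        action: delete
```

By contrast, an insert action that adds the same new attribute to every data point keeps previously distinct points distinct, which is why adding attributes is described as safe.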