[Tiered caching][Milestone 1] Exposing a new cache stats API #12258

@sgup432

Is your feature request related to a problem? Please describe

Overview

As part of tiered caching milestone 1 (#10870), we plan to extend IndicesRequestCache with additional caching tiers such as disk.
We will need to introduce new stats that expose "per tier" values in addition to the stats that already exist.

Current state

Taking IndicesRequestCache as an example, its stats are exposed as part of node indices stats and index stats. Both accept a level query parameter, which aggregates stats on dimensions such as shards or indices. Examples below:

  • Node stats -

    • GET /_nodes/stats/indices/request_cache
      • Request cache stats for all nodes in the cluster.
    • GET /_nodes/stats/indices/request_cache?level=indices
      • Aggregating at indices level
    • GET /_nodes/stats/indices/request_cache?level=shards
      • Aggregating at shard level.
  • Index stats - Similar to the above, except that the request is scoped to the given indices.

    • GET /<index-name>/_stats/request_cache
    • GET /<index-name>/_stats/request_cache?level=shards

Describe the solution you'd like

Maintaining cache stats for respective tiers with key based dimension support

Since we are introducing new cache interfaces inside OpenSearch that can be used to implement any caching tier (onHeap, disk, etc.), we can add a new cache stats section under node stats. This way, each cache tier maintains its own stats, which decouples the stats logic from the consumers of these underlying caches.
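
To make the idea of tier-owned stats with key-based dimensions more concrete, here is a minimal Python sketch. Everything in it (CacheStats, TierStatsHolder, the method names, and the dimension tuple layout) is hypothetical and only illustrates the concept; it is not the OpenSearch implementation or its interfaces.

```python
from collections import defaultdict
from dataclasses import dataclass, fields


@dataclass
class CacheStats:
    """Counters matching the stat fields listed in this proposal."""
    memory_size_in_bytes: int = 0
    evictions: int = 0
    hit_count: int = 0
    miss_count: int = 0
    entries: int = 0

    def add(self, other):
        # Accumulate every counter from `other` into this instance.
        for f in fields(self):
            setattr(self, f.name, getattr(self, f.name) + getattr(other, f.name))


class TierStatsHolder:
    """Hypothetical holder owned by one cache tier (for example onheap or disk).

    Stats are keyed by a tuple of dimension values, so the tier records
    them itself and the consumer does not need its own stats logic.
    """

    def __init__(self, tier_name, dimension_names):
        self.tier_name = tier_name
        self.dimension_names = tuple(dimension_names)
        self._stats = defaultdict(CacheStats)

    def on_hit(self, key):
        self._stats[key].hit_count += 1

    def on_miss(self, key):
        self._stats[key].miss_count += 1

    def on_put(self, key, size_in_bytes):
        self._stats[key].entries += 1
        self._stats[key].memory_size_in_bytes += size_in_bytes

    def on_evict(self, key, size_in_bytes):
        self._stats[key].entries -= 1
        self._stats[key].memory_size_in_bytes -= size_in_bytes
        self._stats[key].evictions += 1

    def aggregate(self, levels):
        """Roll stats up to the requested subset of dimensions."""
        positions = [self.dimension_names.index(level) for level in levels]
        rolled_up = defaultdict(CacheStats)
        for key, stats in self._stats.items():
            rolled_up[tuple(key[i] for i in positions)].add(stats)
        return dict(rolled_up)
```

A consumer such as IndicesRequestCache would then only pass the dimension values (for instance the index name and shard ID) along with each cache operation, and a stats API could call aggregate with the requested level values.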

Backward compatibility: To maintain backward compatibility, we will continue to support the existing indices request_cache stats exactly as they work today.

API Details below.

Cache stats API (new API):

  • Request

    • GET /_nodes/stats/caches?pretty
    • GET /_nodes/stats/caches/<cache_type>?pretty
    • GET /_nodes/stats/caches/<cache_type>?level=dimension1,dimension2&pretty
  • Path parameters

    • <cache_type>: (Optional, string). Limits the information to a specific cache type within OpenSearch. For example, IndicesRequestCache, QueryCache etc.
  • Query parameters

    • level: (Optional, string): One or more comma-separated dimensions by which stats are aggregated for a specific cache type.
      • The tier value for level is shared across all cache types.
      • The remaining dimensions are specific to each cache type:
        • cache_type: request_cache
          • Valid values for level are:
            • shards (keyed by shard ID, for example index1[0])
            • tier
            • indices
            • tier, indices
            • tier, shards
  • Response body

    • caches: (Object): Contains stats for the requested caches present in OpenSearch.
      • request_cache: (Object): Contains stats for IndicesRequestCache cache type.
        • memory_size_in_bytes
        • evictions
        • hit_count
        • miss_count
        • entries
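
As a rough illustration of the request shapes listed above, the following Python sketch assembles the request path and checks the level value against the combinations listed for request_cache. The helper name build_cache_stats_path and the ALLOWED_LEVELS table are hypothetical, and treating the level combinations as order-insensitive is an assumption, since the proposal does not say whether order matters.

```python
# Level combinations listed in this proposal for request_cache.
# Assumption: order within a combination does not matter.
ALLOWED_LEVELS = {
    "request_cache": {
        frozenset({"shards"}),
        frozenset({"tier"}),
        frozenset({"indices"}),
        frozenset({"tier", "indices"}),
        frozenset({"tier", "shards"}),
    }
}


def build_cache_stats_path(cache_type=None, levels=(), pretty=True):
    """Build the path and query string for the proposed cache stats API."""
    path = "/_nodes/stats/caches"
    if cache_type:
        path += "/" + cache_type

    params = []
    if levels:
        allowed = ALLOWED_LEVELS.get(cache_type)
        # Only validate cache types whose level values are known here.
        if allowed is not None and frozenset(levels) not in allowed:
            raise ValueError(f"unsupported level for {cache_type}: {list(levels)}")
        params.append("level=" + ",".join(levels))
    if pretty:
        params.append("pretty")

    return path + ("?" + "&".join(params) if params else "")
```

For example, build_cache_stats_path("request_cache", ["tier", "shards"]) yields /_nodes/stats/caches/request_cache?level=tier,shards&pretty.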

Examples:
Request cache stats, aggregated by tier and shards:
GET /_nodes/stats/caches/request_cache?level=tier,shards&pretty

"caches": {
   "request_cache": { // one cache type
        "memory_size_in_bytes" : 3,
        "evictions" : 1,
        "hit_count" : 4,
        "miss_count" : 3,
        "entries" : 2,
        "shards": {
            "index1[0]": {
                "tier": {
                    "onheap": {
                        "memory_size_in_bytes" : 2,
                        "evictions" : 1,
                        "hit_count" : 2,
                        "miss_count" : 2,
                        "entries": 1
                    },
                    "disk": {
                        "memory_size_in_bytes" : 1,
                        "evictions" : 0,
                        "hit_count" : 2,
                        "miss_count" : 1,
                        "entries": 1
                    }
                }
            }
        }
    },
    "query_cache": {} // Other cache type
}
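
In the example response above, each top-level request_cache counter equals the sum of the matching onheap and disk counters. The Python sketch below reproduces that roll-up on the same numbers. Whether the final API guarantees this relationship is an assumption drawn from the example only, and the names FIELDS and roll_up are hypothetical.

```python
FIELDS = ("memory_size_in_bytes", "evictions", "hit_count", "miss_count", "entries")

# Same numbers as the example response, written as a Python dict.
example_request_cache = {
    "memory_size_in_bytes": 3,
    "evictions": 1,
    "hit_count": 4,
    "miss_count": 3,
    "entries": 2,
    "shards": {
        "index1[0]": {
            "tier": {
                "onheap": {
                    "memory_size_in_bytes": 2,
                    "evictions": 1,
                    "hit_count": 2,
                    "miss_count": 2,
                    "entries": 1,
                },
                "disk": {
                    "memory_size_in_bytes": 1,
                    "evictions": 0,
                    "hit_count": 2,
                    "miss_count": 1,
                    "entries": 1,
                },
            }
        }
    },
}


def roll_up(cache_stats):
    """Sum every counter across all shards and tiers of one cache type."""
    totals = dict.fromkeys(FIELDS, 0)
    for shard in cache_stats["shards"].values():
        for tier_stats in shard["tier"].values():
            for field in FIELDS:
                totals[field] += tier_stats[field]
    return totals


totals = roll_up(example_request_cache)
# For this example, the rolled-up values match the top-level counters.
matches = all(totals[f] == example_request_cache[f] for f in FIELDS)
```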

Related component

Search:Performance

Describe alternatives you've considered

Update the existing indices request cache stats to include per-tier stats.
Something like:

GET /_nodes/stats/indices/request_cache?pretty

"request_cache" : {
    "tier": {
        "onheap": {
            "memory_size_in_bytes" : 331136248,
            "evictions" : 47033,
            "hit_count" : 89953,
            "miss_count" : 3328005
        },
        "disk": {
            "memory_size_in_bytes" : 331136248,
            "evictions" : 47033,
            "hit_count" : 89953,
            "miss_count" : 3328005
        }
     }
 }

One drawback of this approach is that each consumer (like IndicesRequestCache) would have to handle its own stats-writing logic, instead of this being taken care of by the underlying caching tier (heap, disk, etc.) itself.

Additional context

No response


    Status

    ✅ Done
