Reduce the metrics cardinality #160

@ShoshinNikita

Right now the metric http_response_time_seconds has one label, path, and r.URL.Path is used as its value. This becomes a problem when a service behind Reproxy uses dynamic URLs (for example, /api/v1/users/{user_id}): every distinct URL creates a new set of time series. In this case, the response of /metrics may become too large for Prometheus to process (especially when resources are limited).

We can't just drop this label because of the backward compatibility promise. However, we can make it optional. It would be enabled by default - again, to comply with the backward compatibility promise.


I also have a few more suggestions (more → less important):

  1. Add the label server to all metrics. It feels wrong that only http_requests_total has this label. For example, it prevents me from creating an alert for 500+ statuses for specific upstreams.
  2. Add metrics http_request_size_bytes{server} and http_response_size_bytes{server} that would allow users to monitor the size of requests and responses.
  3. Add new buckets for http_response_time_seconds - for example, 10 and 30. The current "highest" bucket is 5s, but it can take much longer to process some types of requests. At the same time, I understand that it's hard to find a maximum value that fits everyone. Another option is to make the buckets configurable - but I am not sure it would be practical.
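To give an idea of how much work the configurable-buckets option from the third suggestion would take, here is a rough sketch that parses a comma-separated list of upper bounds in seconds. The function name parseBuckets and the input format are assumptions made for illustration; nothing like this exists in Reproxy today. Only the standard library is used, and the result would still have to be passed to the histogram definition.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strconv"
	"strings"
)

// parseBuckets converts a string such as "0.5, 1, 5, 10, 30" into a sorted
// slice of histogram upper bounds (in seconds). It rejects values that are
// not positive finite numbers, as well as duplicates, because histogram
// buckets have to be strictly increasing.
func parseBuckets(s string) ([]float64, error) {
	parts := strings.Split(s, ",")
	buckets := make([]float64, 0, len(parts))
	for _, p := range parts {
		p = strings.TrimSpace(p)
		if p == "" {
			continue // tolerate empty items like a trailing comma
		}
		v, err := strconv.ParseFloat(p, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid bucket %q: %w", p, err)
		}
		if math.IsNaN(v) || math.IsInf(v, 0) || v <= 0 {
			return nil, fmt.Errorf("bucket %q must be a positive finite number", p)
		}
		buckets = append(buckets, v)
	}
	if len(buckets) == 0 {
		return nil, fmt.Errorf("no buckets given")
	}
	sort.Float64s(buckets)
	for i := 1; i < len(buckets); i++ {
		if buckets[i] == buckets[i-1] {
			return nil, fmt.Errorf("duplicate bucket %v", buckets[i])
		}
	}
	return buckets, nil
}
```

Keeping the current buckets as the default and treating this list as an override would preserve backward compatibility in the same way as the optional path label.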
