Implement missing metrics #9125

Open
@bitoku

Description

Some metrics that cAdvisor supports are missing from CRI-O.
This issue tracks which metrics are missing and whether we can implement them in CRI-O.

https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md?plain=1

https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2371-cri-pod-container-stats

In KEP 2371

CPU

  • container_cpu_cfs_periods_total | Counter | Number of elapsed enforcement period intervals | | cpu |
  • container_cpu_cfs_throttled_periods_total | Counter | Number of throttled period intervals | | cpu |
  • container_cpu_cfs_throttled_seconds_total | Counter | Total time duration the container has been throttled | seconds | cpu |
  • container_cpu_system_seconds_total | Counter | Cumulative system cpu time consumed | seconds | cpu |
  • container_cpu_usage_seconds_total | Counter | Cumulative cpu time consumed | seconds | cpu |
  • container_cpu_user_seconds_total | Counter | Cumulative user cpu time consumed | seconds | cpu |
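As a rough illustration of where these counters could come from, here is a minimal sketch that parses the `cpu.stat` file of a cgroup v2 hierarchy and converts the microsecond fields to seconds. The type and function names (`cpuStat`, `parseCPUStat`, `usecToSeconds`) are hypothetical and not taken from CRI-O; the mapping of fields to metric names in the comments is an assumption based on the metric descriptions above.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// cpuStat holds the cgroup v2 cpu.stat fields that could back the CPU metrics.
// The metric mapping in the comments is an assumption, not CRI-O's actual code.
type cpuStat struct {
	UsageUsec     uint64 // container_cpu_usage_seconds_total
	UserUsec      uint64 // container_cpu_user_seconds_total
	SystemUsec    uint64 // container_cpu_system_seconds_total
	NrPeriods     uint64 // container_cpu_cfs_periods_total
	NrThrottled   uint64 // container_cpu_cfs_throttled_periods_total
	ThrottledUsec uint64 // container_cpu_cfs_throttled_seconds_total
}

// parseCPUStat reads "key value" lines in the cgroup v2 cpu.stat format.
// Keys it does not know are ignored.
func parseCPUStat(content string) (cpuStat, error) {
	var s cpuStat
	targets := map[string]*uint64{
		"usage_usec":     &s.UsageUsec,
		"user_usec":      &s.UserUsec,
		"system_usec":    &s.SystemUsec,
		"nr_periods":     &s.NrPeriods,
		"nr_throttled":   &s.NrThrottled,
		"throttled_usec": &s.ThrottledUsec,
	}
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		dst, ok := targets[fields[0]]
		if !ok {
			continue
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return cpuStat{}, fmt.Errorf("parsing %s: %w", fields[0], err)
		}
		*dst = v
	}
	return s, sc.Err()
}

// usecToSeconds converts a microsecond counter to seconds.
func usecToSeconds(usec uint64) float64 {
	return float64(usec) / 1e6
}

func main() {
	// Sample content; a real collector would read the container's cpu.stat file.
	sample := "usage_usec 2500000\nuser_usec 1500000\nsystem_usec 1000000\n" +
		"nr_periods 100\nnr_throttled 4\nthrottled_usec 250000\n"
	s, err := parseCPUStat(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("usage=%.2fs throttled=%.2fs periods=%d throttled_periods=%d\n",
		usecToSeconds(s.UsageUsec), usecToSeconds(s.ThrottledUsec), s.NrPeriods, s.NrThrottled)
}
```

The throttling fields only appear in `cpu.stat` when the cpu controller is enabled for the cgroup, so a collector would need to tolerate their absence; this sketch simply leaves them at zero.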

CPU Load

  • container_cpu_load_average_10s | Gauge | Value of container cpu load average over the last 10 seconds | | cpuLoad |
  • container_tasks_state | Gauge | Number of tasks in given state (sleeping, running, stopped, uninterruptible, or ioawaiting) | | cpuLoad |

Disk

  • container_fs_inodes_free | Gauge | Number of available Inodes | | disk |
  • container_fs_inodes_total | Gauge | Total number of Inodes | | disk |
  • container_fs_limit_bytes | Gauge | Number of bytes that can be consumed by the container on this filesystem | bytes | disk |
  • container_fs_usage_bytes | Gauge | Number of bytes that are consumed by the container on this filesystem | bytes | disk |

DiskIO

  • container_fs_io_current | Gauge | Number of I/Os currently in progress | | diskIO |
  • container_fs_io_time_seconds_total | Counter | Cumulative count of seconds spent doing I/Os | seconds | diskIO |
  • container_fs_io_time_weighted_seconds_total | Counter | Cumulative weighted I/O time | seconds | diskIO |
  • container_fs_reads_bytes_total | Counter | Cumulative count of bytes read | bytes | diskIO |
  • container_fs_read_seconds_total | Counter | Cumulative count of seconds spent reading | seconds | diskIO |
  • container_fs_reads_merged_total | Counter | Cumulative count of reads merged | | diskIO |
  • container_fs_reads_total | Counter | Cumulative count of reads completed | | diskIO |
  • container_fs_sector_reads_total | Counter | Cumulative count of sector reads completed | | diskIO |
  • container_fs_sector_writes_total | Counter | Cumulative count of sector writes completed | | diskIO |
  • container_fs_writes_bytes_total | Counter | Cumulative count of bytes written | bytes | diskIO |
  • container_fs_write_seconds_total | Counter | Cumulative count of seconds spent writing | seconds | diskIO |
  • container_fs_writes_merged_total | Counter | Cumulative count of writes merged | | diskIO |
  • container_fs_writes_total | Counter | Cumulative count of writes completed | | diskIO |
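Some of the DiskIO counters above have a plausible source in the cgroup v2 `io.stat` file, which reports per-device `rbytes`, `wbytes`, `rios` and `wios` values. The following is a minimal sketch of parsing that format; `ioCounters` and `parseIOStat` are hypothetical names, and the mapping to metric names in the comments is an assumption rather than CRI-O's implementation. Metrics such as merged or sector counts are not present in `io.stat` and would need another source.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ioCounters holds the io.stat fields this sketch maps to DiskIO metrics.
type ioCounters struct {
	ReadBytes  uint64 // assumed source for container_fs_reads_bytes_total
	WriteBytes uint64 // assumed source for container_fs_writes_bytes_total
	Reads      uint64 // assumed source for container_fs_reads_total
	Writes     uint64 // assumed source for container_fs_writes_total
}

// parseIOStat parses cgroup v2 io.stat content, where each line looks like
// "8:0 rbytes=1 wbytes=2 rios=3 wios=4 dbytes=0 dios=0".
// The result is keyed by the "major:minor" device identifier.
func parseIOStat(content string) (map[string]ioCounters, error) {
	out := make(map[string]ioCounters)
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		var c ioCounters
		for _, kv := range fields[1:] {
			key, val, ok := strings.Cut(kv, "=")
			if !ok {
				continue
			}
			var dst *uint64
			switch key {
			case "rbytes":
				dst = &c.ReadBytes
			case "wbytes":
				dst = &c.WriteBytes
			case "rios":
				dst = &c.Reads
			case "wios":
				dst = &c.Writes
			default:
				continue // keys such as dbytes/dios are not mapped here
			}
			n, err := strconv.ParseUint(val, 10, 64)
			if err != nil {
				return nil, fmt.Errorf("device %s, key %s: %w", fields[0], key, err)
			}
			*dst = n
		}
		out[fields[0]] = c
	}
	return out, nil
}

func main() {
	// Sample content; a real collector would read the container's io.stat file.
	sample := "8:0 rbytes=1459200 wbytes=314773504 rios=192 wios=353 dbytes=0 dios=0\n"
	stats, err := parseIOStat(sample)
	if err != nil {
		panic(err)
	}
	for dev, c := range stats {
		fmt.Printf("%s read=%dB written=%dB reads=%d writes=%d\n",
			dev, c.ReadBytes, c.WriteBytes, c.Reads, c.Writes)
	}
}
```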

Process

  • container_file_descriptors | Gauge | Number of open file descriptors for the container | | process |
  • container_processes | Gauge | Number of processes running inside the container | | process |
  • container_sockets | Gauge | Number of open sockets for the container | | process |
  • container_threads | Gauge | Number of threads running inside the container | | process |
  • container_threads_max | Gauge | Maximum number of threads allowed inside the container | | process |
  • container_ulimits_soft | Gauge | Soft ulimit values for the container root process. Unlimited if -1, except priority and nice | | process |

Memory

  • container_memory_cache | Gauge | Total page cache memory | bytes | memory |
  • container_memory_failcnt | Counter | Number of memory usage hits limits | | memory |
  • container_memory_failures_total | Counter | Cumulative count of memory allocation failures | | memory |
  • container_memory_mapped_file | Gauge | Size of memory mapped files | bytes | memory |
  • container_memory_max_usage_bytes | Gauge | Maximum memory usage recorded | bytes | memory |
  • container_memory_rss | Gauge | Size of RSS | bytes | memory |
  • container_memory_swap | Gauge | Container swap usage | bytes | memory |
  • container_memory_usage_bytes | Gauge | Current memory usage, including all memory regardless of when it was accessed | bytes | memory |
  • container_memory_working_set_bytes | Gauge | Current working set | bytes | memory |
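The working set metric is a derived value rather than a raw kernel counter. cAdvisor computes it as current usage minus inactive file-backed memory, floored at zero. Below is a minimal sketch of that calculation against cgroup v2 `memory.stat` content; `inactiveFileBytes` and `workingSetBytes` are hypothetical helper names, and whether CRI-O should use exactly this formula is an assumption here.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// inactiveFileBytes extracts the inactive_file value from cgroup v2
// memory.stat content, which consists of "key value" lines.
func inactiveFileBytes(memoryStat string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(memoryStat))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "inactive_file" {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, errors.New("inactive_file not found in memory.stat")
}

// workingSetBytes subtracts inactive file memory from usage and floors the
// result at zero so the unsigned subtraction cannot wrap around.
func workingSetBytes(usage, inactiveFile uint64) uint64 {
	if inactiveFile > usage {
		return 0
	}
	return usage - inactiveFile
}

func main() {
	// Sample values; a real collector would read memory.current and memory.stat.
	stat := "anon 4096\nfile 8192\ninactive_file 3000\nactive_file 5192\n"
	inactive, err := inactiveFileBytes(stat)
	if err != nil {
		panic(err)
	}
	fmt.Println("working set bytes:", workingSetBytes(10000, inactive))
}
```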

Network

  • container_network_receive_bytes_total | Counter | Cumulative count of bytes received | bytes | network |
  • container_network_receive_errors_total | Counter | Cumulative count of errors encountered while receiving | | network |
  • container_network_receive_packets_dropped_total | Counter | Cumulative count of packets dropped while receiving | | network |
  • container_network_receive_packets_total | Counter | Cumulative count of packets received | | network |
  • container_network_transmit_bytes_total | Counter | Cumulative count of bytes transmitted | bytes | network |
  • container_network_transmit_errors_total | Counter | Cumulative count of errors encountered while transmitting | | network |
  • container_network_transmit_packets_dropped_total | Counter | Cumulative count of packets dropped while transmitting | | network |
  • container_network_transmit_packets_total | Counter | Cumulative count of packets transmitted | | network |
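One common source for these per-interface counters is `/proc/net/dev` as seen from the pod's network namespace. The sketch below parses that format into receive and transmit counters; `netCounters` and `parseNetDev` are hypothetical names, and reading from procfs (rather than netlink, for example) is an assumption made for illustration only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// netCounters holds the /proc/net/dev columns used by the network metrics.
type netCounters struct {
	RxBytes, RxPackets, RxErrors, RxDropped uint64
	TxBytes, TxPackets, TxErrors, TxDropped uint64
}

// parseNetDev parses /proc/net/dev content. Each data line has the form
// "iface: v0 v1 ... v15": eight receive columns followed by eight transmit
// columns. The two header lines contain no colon and are skipped.
func parseNetDev(content string) (map[string]netCounters, error) {
	out := make(map[string]netCounters)
	for _, line := range strings.Split(content, "\n") {
		name, rest, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		fields := strings.Fields(rest)
		if len(fields) < 16 {
			continue
		}
		vals := make([]uint64, 16)
		for i := range vals {
			v, err := strconv.ParseUint(fields[i], 10, 64)
			if err != nil {
				return nil, fmt.Errorf("interface %s, column %d: %w", strings.TrimSpace(name), i, err)
			}
			vals[i] = v
		}
		out[strings.TrimSpace(name)] = netCounters{
			RxBytes: vals[0], RxPackets: vals[1], RxErrors: vals[2], RxDropped: vals[3],
			TxBytes: vals[8], TxPackets: vals[9], TxErrors: vals[10], TxDropped: vals[11],
		}
	}
	return out, nil
}

func main() {
	// Sample content with made-up values in the /proc/net/dev layout.
	sample := "Inter-|   Receive                                                |  Transmit\n" +
		" face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed\n" +
		"  eth0:    1000      10    1    2    0     0          0         0     2000      20    3    4    0     0       0          0\n"
	stats, err := parseNetDev(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", stats["eth0"])
}
```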

OOM Event

  • container_oom_events_total | Counter | Count of out of memory events observed for the container | | oom_event |

Others

  • container_last_seen | Gauge | Last time a container was seen by the exporter | timestamp | - |
  • container_spec_cpu_period | Gauge | CPU period of the container | | - |
  • container_spec_cpu_quota | Gauge | CPU quota of the container | | - |
  • container_spec_cpu_shares | Gauge | CPU share of the container | | - |
  • container_spec_memory_limit_bytes | Gauge | Memory limit for the container | bytes | - |
  • container_spec_memory_reservation_limit_bytes | Gauge | Memory reservation limit for the container | bytes | - |
  • container_spec_memory_swap_limit_bytes | Gauge | Memory swap limit for the container | bytes | - |
  • container_start_time_seconds | Gauge | Start time of the container since unix epoch | seconds | - |

Not in KEP 2371

CPU Set

  • container_memory_migrate | Gauge | Memory migrate status | | cpuset |

Disk IO

  • container_blkio_device_usage_total | Counter | Blkio device bytes usage | bytes | diskIO |

sched

  • container_cpu_schedstat_run_periods_total | Counter | Number of times processes of the cgroup have run on the cpu | | sched |
  • container_cpu_schedstat_runqueue_seconds_total | Counter | Time duration processes of the container have been waiting on a runqueue | seconds | sched |
  • container_cpu_schedstat_run_seconds_total | Counter | Time duration the processes of the container have run on the CPU | seconds | sched |

advtcp

  • container_network_advance_tcp_stats_total | Gauge | advanced tcp connections statistic for container | | advtcp |

TCP

  • container_network_tcp6_usage_total | Gauge | tcp6 connection usage statistic for container | | tcp |
  • container_network_tcp_usage_total | Gauge | tcp connection usage statistic for container | | tcp |

UDP

  • container_network_udp6_usage_total | Gauge | udp6 connection usage statistic for container | | udp |
  • container_network_udp_usage_total | Gauge | udp connection usage statistic for container | | udp |

libpfm

  • container_perf_events_scaling_ratio | Gauge | Scaling ratio for perf event counter (event can be identified by event label and cpu indicates the core for which event was measured). See perf event configuration. | | perf_event | libpfm
  • container_perf_events_total | Counter | Scaled counter of perf core event (event can be identified by event label and cpu indicates the core for which event was measured). See perf event configuration. | | perf_event | libpfm
  • container_perf_uncore_events_scaling_ratio | Gauge | Scaling ratio for perf uncore event counter (event can be identified by event label, pmu and socket labels indicate the PMU and the CPU socket for which event was measured). See perf event configuration. Metric exists only for main cgroup (id="/"). | | perf_event | libpfm
  • container_perf_uncore_events_total | Counter | Scaled counter of perf uncore event (event can be identified by event label, pmu and socket labels indicate the PMU and the CPU socket for which event was measured). See perf event configuration. Metric exists only for main cgroup (id="/"). | | perf_event | libpfm

Referenced Memory

  • container_referenced_bytes | Gauge | Bytes referenced by the container during the last measurement cycle, based on the Referenced field in the /proc/smaps file, with /proc/PIDs/clear_refs set to 1 after the number of cycles configured through the referenced_reset_interval cAdvisor parameter.
    Warning: this collection is intrusive because it can influence kernel page reclaim policy and add latency. Refer to https://github.com/brendangregg/wss#wsspl-referenced-page-flag for more details. | bytes | referenced_memory |

Memory NUMA

  • container_memory_numa_pages | Gauge | Number of used pages per NUMA node | | memory_numa |

hugetlb

  • container_hugetlb_failcnt | Counter | Number of hugepage usage hits limits | | hugetlb |
  • container_hugetlb_max_usage_bytes | Gauge | Maximum hugepage usages recorded | bytes | hugetlb |
  • container_hugetlb_usage_bytes | Gauge | Current hugepage usage | bytes | hugetlb |

resctrl

  • container_llc_occupancy_bytes | Gauge | Last level cache usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl |
  • container_memory_bandwidth_bytes | Gauge | Total memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl |
  • container_memory_bandwidth_local_bytes | Gauge | Local memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl |

Labels: good first issue, kind/feature
