Closed
Description
If we deploy Pigsty to monitor PG instances in the same datacenter, the pg_exporter connect timeout of 100ms is fine.
But in a real production environment we have PG instances in multiple regions. If we deploy a single Pigsty to monitor all the PG instances across those regions, pg_exporter reports this error:
Nov 24 10:27:07 staging-gcp-sg-vm-platform-pigsty-1 pg_exporter_staging-gcp-hk-pgsql12-platform-1-1[28699]: time="2021-11-24T10:27:07Z" level=error msg="fail connecting to primary server: fail fetching server version: driver: bad connection, retrying in 10s" source="pg_exporter.go:1521"
Nov 24 10:27:07 staging-gcp-sg-vm-platform-pigsty-1 pg_exporter: time="2021-11-24T10:27:07Z" level=error msg="fail connecting to primary server: fail fetching server version: driver: bad connection, retrying in 10s" source="pg_exporter.go:1521"
In fact, we can still connect to the PG instance from the Pigsty host with the psql command line.
The timeout config in pg_exporter is:
As we discussed:
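Anything over 100ms is actively cancelled and the scrape is judged as failed, to avoid an avalanche. I had not expected cross-datacenter scraping before.
Judging from this packet capture, it lands almost exactly on the 100ms threshold.
The result had already come back at +150ms, but the request was still reported as an error because of the timeout.
-- As for this timeout threshold, could the next version also make it an optional parameter? The default would stay at 100ms, and in a cross-region case like this it could be raised, for example to 1s, which should still be enough for monitoring. The benefit is that one Pigsty deployed on one VM in one region could monitor the PG instances in all regions.
--
Reasonable
Feel free to open an issue for me at https://github.com/Vonng/pg_exporter
I will change it in the next release.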
So I suggest making the pg_exporter connect timeout threshold configurable.
By default, the value stays at 100ms.
In special environments, such as one Pigsty monitoring PG instances in multiple datacenters, we can increase the value.
Vonng
Metadata
Assignees
Labels
enhancement (New feature or request)