Add pg_exporter connect PG instance timeout setting #25

@FeixiangZhao

Description

When pigsty monitors PG instances in the same datacenter, the pg_exporter connect timeout of 100ms is not a problem.

But in a real prod env, we will have multi-region PG instances. If we deploy one pigsty to monitor all the PG instances in the different regions, pg_exporter reports this error:

Nov 24 10:27:07 staging-gcp-sg-vm-platform-pigsty-1 pg_exporter_staging-gcp-hk-pgsql12-platform-1-1[28699]: time="2021-11-24T10:27:07Z" level=error msg="fail connecting to primary server: fail fetching server version: driver: bad connection, retrying in 10s" source="pg_exporter.go:1521"
Nov 24 10:27:07 staging-gcp-sg-vm-platform-pigsty-1 pg_exporter: time="2021-11-24T10:27:07Z" level=error msg="fail connecting to primary server: fail fetching server version: driver: bad connection, retrying in 10s" source="pg_exporter.go:1521"

But in fact, we can connect to the PG instance from the pigsty host with the psql command line.

The timeout config in pg_exporter is:

image
image

As we discussed:

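Anything over 100ms is actively cancelled and the scrape is judged a failure, to avoid an avalanche. I had not anticipated cross-datacenter scraping before.
Judging from this packet capture, it lands almost exactly on the 100ms threshold.
The result had already come back at +150ms, but the request was still reported as an error because of the timeout.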

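-- But for this timeout threshold, could the next version make it an optional parameter? The default would still be 100ms; in cross-region cases like this it could be raised, say to 1s, which should still be enough for monitoring. The benefit is that a pigsty deployed on a single VM in one region could monitor the PG instances in all regions.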

-- 
Reasonable
Feel free to open an issue for me at https://github.com/Vonng/pg_exporter
I will change it in the next release.

So I suggest making the pg_exporter connect timeout threshold configurable.
By default, the value stays at 100ms.
In special environments, such as one pigsty monitoring PG instances across multiple datacenters, we can increase the value.

Metadata

Assignees

Labels

enhancement: New feature or request
