
Conversation

@agelwarg commented on Aug 13, 2025

For high-NVPS (new values per second) deployments where a proxy is responsible for a large volume of data, housekeeping on the proxy can become a bottleneck and consume significant CPU and memory. To address this, we're looking to add TimescaleDB partitioning support to the proxy_history table, along with corresponding changes to the housekeeping process, similar to what the Zabbix server already supports. In this case, however, we partition on the id column, since it is an ever-increasing value, with a chunk_time_interval of 1,000,000.

This should be paired with zabbix/zabbix-docker#1755
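For illustration only, here is a minimal sketch (not the PR's actual code) of what the chunk-based cleanup could look like. It assumes TimescaleDB's drop_chunks() accepts an integer older_than bound for an id-partitioned hypertable, and that the zbx_db_* helpers behave as in the housekeeping excerpts below; the function name and the zbx_db_result_t type are assumptions on my part.

/* Illustrative sketch only: drop every proxy_history chunk whose id range lies
 * entirely below keep_from. Because the hypertable is partitioned on id, the
 * older_than argument is an id value (bigint), not a timestamp. */
static void	hk_drop_proxy_history_chunks(zbx_uint64_t keep_from)
{
	zbx_db_result_t	result;

	/* drop_chunks() returns the names of the dropped chunks, so a plain select suffices */
	result = zbx_db_select("select drop_chunks('proxy_history', " ZBX_FS_UI64 "::bigint)", keep_from);
	zbx_db_free_result(result);
}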

for 'proxy_history' table through timescaledb, similar to zabbix_server.
@agelwarg (Author)

The current state of this PR is enough to demonstrate the significant (positive) impact, but there are open questions that should be reviewed and addressed. I will call out the ones I am aware of in a review.

for ("proxy_history")
{
print<<EOF
PERFORM create_hypertable('$_', 'id', chunk_time_interval => 1000000, $flags);
@agelwarg (Author)

In our testing, 1,000,000 is a good balance between low-throughput and high-throughput systems: on a small proxy it may take a few hours before a partition is dropped, while on a large proxy not too many partitions are created between housekeeping runs. For example, at 100 NVPS a 1,000,000-row chunk covers roughly 10,000 seconds (about 2.8 hours), while at 10,000 NVPS it covers about 100 seconds.

Comment on lines 79 to 81
const char* enable_timescale = getenv("ENABLE_TIMESCALEDB");

if (0 == strcmp(table,"proxy_history") && 0 == strcmp(enable_timescale, "true"))
@agelwarg (Author)

The condition that decides whether partition management should be used instead of a delete from ... is much simpler than on the server side: we do not need to worry about compression or overrides, since compression should never be used on the proxy and no global retention overrides apply. However, relying on an environment variable is probably not the best answer. I'm also not sure that relying on a value in the config table (as the server housekeeper does) is right either, especially since the config table doesn't appear to be populated or even used on a proxy.
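One possible alternative, sketched here only as an idea and not as part of this PR: ask the database itself whether the extension is installed, instead of reading ENABLE_TIMESCALEDB (which, as written above, would also need a NULL guard, since getenv() returns NULL when the variable is unset). The helper name hk_timescaledb_enabled() and the zbx_db_result_t/zbx_db_row_t types are assumptions; the query only uses the standard PostgreSQL pg_extension catalog.

/* Hypothetical alternative to the ENABLE_TIMESCALEDB environment variable:
 * check whether the timescaledb extension is installed in the database. */
static int	hk_timescaledb_enabled(void)
{
	zbx_db_result_t	result;
	zbx_db_row_t	row;
	int		ret = FAIL;

	result = zbx_db_select("select 1 from pg_extension where extname='timescaledb'");

	if (NULL != (row = zbx_db_fetch(result)))
		ret = SUCCEED;	/* extension present */

	zbx_db_free_result(result);

	return ret;
}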

Comment on lines 85 to 100
if (0 != config_local_buffer)
condition = zbx_dsprintf(NULL, " or %s>=%d", clock_field, now - config_local_buffer * SEC_PER_HOUR);

result = zbx_db_select(
"select coalesce(min(id),%d) from %s"
" where (id>" ZBX_FS_UI64 " and %s>=%d) %s",
maxid + 1, table, lastid,
clock_field, now - config_offline_buffer * SEC_PER_HOUR,
ZBX_NULL2EMPTY_STR(condition));
zbx_free(condition);

if (NULL == (row = zbx_db_fetch(result)) || SUCCEED == zbx_db_is_null(row[0]))
goto rollback;

ZBX_STR2UINT64(keep_from, row[0]);
zbx_db_free_result(result);
@agelwarg (Author)

Here we're building a query to determine the minimum id from which we should retain data, somewhat the inverse of how the delete below is built.

@agelwarg (Author)

If we refactor keep_from to be the minimum id of the partition (TimescaleDB chunk) that contains the currently calculated keep_from, then the proper partitions will still be dropped AND the count of records determined below will be correct, which ultimately gets logged later on.
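As a concrete illustration of that alignment, using made-up numbers (the C form below mirrors the SQL floor(...) expression mentioned in the next comment):

/* Illustration only: aligning keep_from down to a chunk boundary with
 * chunk_time_interval = 1000000. */
zbx_uint64_t	keep_from = 123456789;		/* value produced by the select above (example) */

keep_from = (keep_from / 1000000) * 1000000;	/* -> 123000000, the first id of its chunk */
/* chunks covering ids below 123000000 can be dropped, and the record count
 * computed from this keep_from matches what is actually removed */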

@agelwarg (Author)

Refactored keep_from to be floor(coalesce(min(id),%d)/1000000)*1000000
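For reference, a sketch of how the select from the excerpt above might read after that change (not necessarily the exact code in the PR):

	result = zbx_db_select(
			"select floor(coalesce(min(id),%d)/1000000)*1000000 from %s"
			" where (id>" ZBX_FS_UI64 " and %s>=%d) %s",
			maxid + 1, table, lastid,
			clock_field, now - config_offline_buffer * SEC_PER_HOUR,
			ZBX_NULL2EMPTY_STR(condition));

Since id is positive and integer division in PostgreSQL already truncates, the floor() mostly makes the intent explicit.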

@agelwarg changed the title from "Add proxy support for TimescalDB to drop partitions" to "Add proxy support for TimescaleDB to drop partitions" on Aug 14, 2025