Description
We will change all plugins to default to a single tp_index value of "default".
tailpipe compact should be modified to re-index existing files to the new scheme. For example, if you have already collected CloudTrail logs and the tp_index is the account ID, compact will change the tp_index of all rows to "default" and then compact the files into the new hive structure (all accounts in a single file per day, in the tp_index=default folder). If the user defines a tp_index (see below), tailpipe compact should re-index to that scheme instead.
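The path side of that re-indexing can be sketched roughly as follows. This is only an illustration, not Tailpipe's implementation: the tp_table, tp_partition and tp_date folder names, the file names, and both helper functions are assumptions made for the example. Only the tp_index=default folder comes from the description above.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def reindexed_path(path: str, new_index: str = "default") -> str:
    """Return `path` with its tp_index=<value> folder replaced by new_index.

    Hypothetical helper: it only rewrites the folder name. The tp_index
    column inside the rows would also need to be rewritten.
    """
    parts = [
        f"tp_index={new_index}" if part.startswith("tp_index=") else part
        for part in PurePosixPath(path).parts
    ]
    return str(PurePosixPath(*parts))


def compaction_groups(paths, new_index="default"):
    """Group existing files by the folder they would be compacted into."""
    groups = defaultdict(list)
    for path in paths:
        target_dir = str(PurePosixPath(reindexed_path(path, new_index)).parent)
        groups[target_dir].append(path)
    return dict(groups)


# Assumed layout: files previously indexed by account ID.
prefix = "tp_table=aws_cloudtrail_log/tp_partition=s3_bucket_us_east_1"
existing = [
    f"{prefix}/tp_index=111111111111/tp_date=2024-01-01/a.parquet",
    f"{prefix}/tp_index=222222222222/tp_date=2024-01-01/b.parquet",
    f"{prefix}/tp_index=111111111111/tp_date=2024-01-02/c.parquet",
]

groups = compaction_groups(existing)
for target_dir, files in groups.items():
    print(target_dir, "<-", len(files), "file(s)")
```

In this sketch the two accounts' files for 2024-01-01 land in the same tp_index=default folder, which is where they would then be merged into a single file for that day.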
A user may optionally use a column as the index on a per-partition basis, e.g.:
partition "aws_cloudtrail_log" "s3_bucket_us_east_1" {
  source "aws_s3_bucket" {
    connection  = connection.aws.account_a
    bucket      = "aws-cloudtrail-logs-account-a"
    file_layout = `AWSLogs/(%{DATA:org_id}/)?%{NUMBER:account_id}/CloudTrail/us-east-1/%{DATA}.json.gz`
  }
  tp_index = "account_id"
}
The user may create a "composite index" by using a function instead. The syntax should be the same as for the transform column argument:
partition "aws_cloudtrail_log" "s3_bucket_us_east_1" {
  source "aws_s3_bucket" {
    connection  = connection.aws.account_a
    bucket      = "aws-cloudtrail-logs-account-a"
    file_layout = `AWSLogs/(%{DATA:org_id}/)?%{NUMBER:account_id}/CloudTrail/us-east-1/%{DATA}.json.gz`
  }
  tp_index = "concat(account_id, '_', region)"
}
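To make the difference between the two tp_index forms concrete, here is a rough sketch that evaluates a tp_index setting as a SQL expression over each row. It uses Python's built-in sqlite3 module purely as a stand-in; this issue does not say how or where Tailpipe would evaluate the expression. The tp_index_values helper, the sample rows, and the user-defined concat function are all assumptions for illustration.

```python
import sqlite3


def _concat(*args):
    # Stand-in for a SQL concat(): joins the non-NULL arguments as text.
    return "".join(str(a) for a in args if a is not None)


def tp_index_values(rows, tp_index_expr):
    """Evaluate tp_index_expr for each (account_id, region) row.

    Hypothetical helper: the expression is treated as trusted config and
    interpolated directly into the query, as a column name or function call.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # narg=-1 lets the function accept any number of arguments.
        conn.create_function("concat", -1, _concat)
        conn.execute("CREATE TABLE logs (account_id TEXT, region TEXT)")
        conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
        cursor = conn.execute(f"SELECT {tp_index_expr} FROM logs ORDER BY rowid")
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


sample_rows = [
    ("111111111111", "us-east-1"),
    ("222222222222", "eu-west-1"),
]

# Plain column index, as in the first partition example.
print(tp_index_values(sample_rows, "account_id"))

# Composite index, as in the second partition example.
print(tp_index_values(sample_rows, "concat(account_id, '_', region)"))
```

Because a bare column name is itself a valid expression, both forms could go through the same evaluation path, which may be one reason to reuse the transform column syntax.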