Support for specifying multiple input prefixes in Aggregation Service CreateJob requests: Feedback Requested

## Problem
The Aggregation Service team has heard feedback (#67) from partners about the difficulty of submitting jobs to the Aggregation Service. The cause is the API's rigid input prefix requirement, which may not fit the data organization pattern an adtech has chosen.

Currently, the Aggregation Service's CreateJob API accepts a single value for the parameter `input_data_blob_prefix`. This parameter is the input prefix under which the aggregatable reports are stored. The aggregation job includes every report stored under this prefix hierarchy when generating the summary report.
This is a problem when only a subset of the reports stored under a prefix is intended for an aggregation job, yet that prefix is the only common prefix of those reports. Issue #67 describes one such situation.
The current workaround (suggested [here](https://github.com/privacysandbox/aggregation-service/issues/67#issuecomment-2377168136)) is for adtechs to reorganize their reports so that a given input prefix contains only the reports intended for a given aggregation job. When an adtech's data organization pattern differs from its querying pattern, it must either copy the reports or move them around to meet this requirement. This is error prone and time consuming, and copying reports also raises storage costs.

## Proposal
To address this problem, we are proposing the following changes to the CreateJob API.
- Introduction of a new field `input_data_blob_prefix_list`, which would accept a list of input prefixes under which the aggregatable reports for a job are stored. The aggregation worker would read all reports stored under each prefix in the list and include their contributions in the generated summary report.
- This field would accept a list with a maximum size of 50 entries. This number can be increased in the future based on adtech feedback.
- **[Backwards compatibility]** 
  - We would be introducing this field in a backwards compatible way. This will be an optional field in the current version of the API.
  - **Exactly one of the two fields** `input_data_blob_prefix` and `input_data_blob_prefix_list` would be required to be specified in the CreateJob request.
  - Users of aggregation service who do not see the need to specify multiple input prefixes can continue using the field `input_data_blob_prefix`.
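
The rules above can be sketched as a small client-side check. This is only an illustration of the proposed semantics, not the service's implementation: the function name `resolve_input_prefixes` and the constant `MAX_PREFIXES` are hypothetical, and the handling of an empty list is not specified by this proposal, so the sketch passes it through unchanged.

```python
from typing import Any, List, Mapping

# Proposed limit from this issue; it may be raised based on feedback.
MAX_PREFIXES = 50


def resolve_input_prefixes(request: Mapping[str, Any]) -> List[str]:
    """Return the prefixes a job would read, or raise ValueError.

    Hypothetical helper: enforces that exactly one of the two fields
    is present and that the list does not exceed the proposed limit.
    """
    single = request.get("input_data_blob_prefix")
    multiple = request.get("input_data_blob_prefix_list")

    has_single = single is not None
    has_multiple = multiple is not None

    # Exactly one of the two fields must be specified.
    if has_single == has_multiple:
        raise ValueError(
            "Specify exactly one of input_data_blob_prefix and "
            "input_data_blob_prefix_list"
        )

    if has_single:
        return [single]

    if len(multiple) > MAX_PREFIXES:
        raise ValueError(
            f"input_data_blob_prefix_list has {len(multiple)} entries; "
            f"the proposed maximum is {MAX_PREFIXES}"
        )

    # Empty-list behavior is unspecified in the proposal; returned as-is.
    return list(multiple)
```

A request that still uses only `input_data_blob_prefix` resolves to a one-element list, which mirrors the backwards-compatible behavior described above.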

### API changes

#### Current CreateJob API request payload schema
```
{
    // other fields of CreateJob request

    "input_data_bucket_name": "my-bucket",
    "input_data_blob_prefix": "my-month/my-day/",
    "job_parameters": {
          // fields inside this json object
     }
}
```

#### Proposed CreateJob API request payload schema
```
{
    // other fields of CreateJob request

    "input_data_bucket_name": "my-bucket",
    "input_data_blob_prefix": "my-month/my-day", // should be absent if input_data_blob_prefix_list is provided
    "input_data_blob_prefix_list": ["my-month/my-day/hour-00",
                                    "my-month/my-day/hour-01",
                                    "my-month/my-day/hour-02",
                                    "my-month/my-day/hour-03",
                                    "my-month/my-day/hour-04"], // optional field
    "job_parameters": {
        // fields inside this json object
    }
}
```
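
As a rough sketch of how a client might assemble such a payload, the snippet below generates the hourly prefixes from the example and builds a request that uses only the new field. The helper names `hourly_prefixes` and `build_create_job_request` are hypothetical, the other CreateJob fields are omitted just as in the schema above, and `job_parameters` is left empty as a placeholder.

```python
import json
from typing import Any, Dict, Iterable, List


def hourly_prefixes(day_prefix: str, hours: Iterable[int]) -> List[str]:
    """Build prefixes such as 'my-month/my-day/hour-00' (hypothetical layout)."""
    return [f"{day_prefix}/hour-{hour:02d}" for hour in hours]


def build_create_job_request(bucket: str, prefixes: List[str]) -> Dict[str, Any]:
    """Assemble only the input-related part of a CreateJob payload.

    input_data_blob_prefix is deliberately absent because
    input_data_blob_prefix_list is provided.
    """
    return {
        # other fields of the CreateJob request are omitted in this sketch
        "input_data_bucket_name": bucket,
        "input_data_blob_prefix_list": prefixes,
        "job_parameters": {},  # placeholder; real jobs need their own parameters
    }


request = build_create_job_request(
    "my-bucket", hourly_prefixes("my-month/my-day", range(5))
)
print(json.dumps(request, indent=2))
```

Generating the list programmatically lets the querying pattern (for example, the first five hours of a day) differ from the storage layout without copying or moving reports.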

## Feedback request
If you have any feedback on the above proposal, please let us know by responding to this issue.

**We would really appreciate your feedback on these API changes. In particular:**
1. Would adtechs find this feature useful?
2. We're proposing a limit of 50 on the number of input prefixes. Do adtechs find this limit sufficient for their use cases?





