Private model training: Improving the efficacy of modelingSignals #1017

@csharrison

This issue aims to improve support for bid optimization in Protected Audience without weakening the privacy stance of the API. This use-case typically involves:

  • Predicting an outcome associated with serving a certain ad (e.g. a predicted click through rate or conversion rate)
  • Varying bid price to optimize this outcome
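As a rough illustration of the two steps above, here is a minimal sketch of expected-value bidding, where the bid scales with the predicted outcome rate. The function name, the `value_per_outcome` parameter, and the optional `max_bid` cap are hypothetical and chosen for illustration; they are not part of the Protected Audience API, and real bidders may use quite different strategies.

```python
from typing import Optional


def expected_value_bid(
    predicted_rate: float,
    value_per_outcome: float,
    max_bid: Optional[float] = None,
) -> float:
    """Return a bid equal to the expected value of showing the ad.

    predicted_rate: model output, e.g. a predicted click-through rate in [0, 1].
    value_per_outcome: what one outcome (click, conversion) is worth to the buyer.
    max_bid: optional cap on the bid (hypothetical budget guard).
    """
    if not 0.0 <= predicted_rate <= 1.0:
        raise ValueError("predicted_rate must be a probability in [0, 1]")
    if value_per_outcome < 0:
        raise ValueError("value_per_outcome must be non-negative")

    # Expected value of the impression: P(outcome) * value of the outcome.
    bid = predicted_rate * value_per_outcome

    # Apply the cap only when one was supplied.
    if max_bid is not None:
        bid = min(bid, max_bid)
    return bid
```

The better the predicted rate, the closer the bid tracks the true value of the impression, which is why the quality of the training signal matters for this use-case.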

Models that learn these predictions are typically trained via supervised learning techniques, i.e. on examples labeled with an outcome (click, conversion, etc.).

There are two techniques we are exploring to improve the status quo here:

  1. A mechanism where modelingSignals can be encrypted and processed in a trusted server environment, where we can offer private model training algorithms.
  2. An improved privacy mechanism to release modelingSignals directly to reportWin. This could look like changes to the existing randomized response mechanism.
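To make option (2) more concrete, here is a minimal sketch of a generic k-ary randomized response mechanism, together with the standard debiasing step for aggregate counts. This is the textbook technique, not the browser's actual implementation; the function names, `domain_size`, and `flip_prob` are assumptions for illustration only.

```python
import random
from typing import Dict


def randomized_response(
    value: int, domain_size: int, flip_prob: float, rng: random.Random
) -> int:
    """With probability flip_prob, replace value with a uniform draw from the domain.

    Note the uniform draw may land on the true value again.
    """
    if not 0 <= value < domain_size:
        raise ValueError("value must lie in [0, domain_size)")
    if not 0.0 <= flip_prob <= 1.0:
        raise ValueError("flip_prob must be a probability")
    if rng.random() < flip_prob:
        return rng.randrange(domain_size)
    return value


def debias_counts(
    noisy_counts: Dict[int, float], domain_size: int, flip_prob: float
) -> Dict[int, float]:
    """Estimate true per-value counts from counts of noised reports.

    Uses E[noisy_v] = (1 - q) * true_v + q * n / k, solved for true_v,
    where q is flip_prob, n the total number of reports, k the domain size.
    """
    if flip_prob >= 1.0:
        raise ValueError("flip_prob must be < 1 to debias")
    total = sum(noisy_counts.values())
    uniform_mass = flip_prob * total / domain_size
    return {
        v: (noisy_counts.get(v, 0) - uniform_mass) / (1.0 - flip_prob)
        for v in range(domain_size)
    }
```

The debiased counts are unbiased in expectation, but their variance grows with the domain size and the flip probability, which is one way to see why releasing higher-dimensional signals through this kind of mechanism is hard.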

Of these two techniques, we think (1) will provide more utility for this use-case, although it also introduces more complexity to the system.

I am filing this issue to collect feedback about the model training use-case. I think we have a pretty good understanding of the shortcomings of the existing modelingSignals approach (mainly its low dimensionality). However, there are lots of auxiliary use-cases / developer journeys involved in training models, including:

  • Feature engineering and A/B testing new features: the ability to support trying out new features and seeing how they perform in offline evals and live experiments.
  • Feature transforms and backtesting new features: the ability to support offline evals with a new feature on old data, which might require some step to transform old examples to use the new feature.
  • Hyperparameter tuning
  • Debugging / monitoring

We’re interested in better understanding these kinds of use-cases. What are we missing? Please let us know, through this issue, if there are other use-cases we should consider when thinking through improvements here.

cc @nikunj101 @michaelkleber
