Identifying the noisy topics #75

@AlexandreGilotte

It seems to me that the current spec of the API may enable a simple and practical attack to identify the noisy topics, which could then be filtered out by the DSPs.

This attack relies on these two rules:

  • "The caller only receives topics it has observed the user visit in the past."
  • "The exception to this filtering is the 5% random topic, that topic will not be filtered."

A direct consequence of these rules is that if a caller has never observed a given user before, then any topic it receives for that user is a random topic.

An attacker could thus call the API with two distinct endpoints:

  • one regular endpoint, observing as much of the web as possible, to get as many user topics as possible (this is just the regular use of the API).
  • an attack endpoint, which has never observed the user before. Any topic returned to this endpoint is a random topic, and should be filtered out of the result of the regular query.
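
The two-endpoint idea can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names, the taxonomy size, and the topic ids are hypothetical, and it assumes both endpoints are answered from the same underlying per-epoch topic, differing only in the observation filter.

```python
import random

NOISE_RATE = 0.05               # the "5% random topic" from the rules above
TAXONOMY = list(range(1, 350))  # hypothetical topic ids; size chosen arbitrarily

def pick_epoch_topic(top_topics, rng):
    """Browser side: return (topic, is_noise) for one epoch."""
    if rng.random() < NOISE_RATE:
        return rng.choice(TAXONOMY), True
    return rng.choice(top_topics), False

def respond(topic, is_noise, observed_by_caller):
    """Apply the observation filter; the random topic bypasses it."""
    if is_noise or topic in observed_by_caller:
        return topic
    return None

def attack_once(top_topics, regular_observed, rng):
    """Return (regular_answer, flagged_as_noise, is_noise) for one epoch."""
    topic, is_noise = pick_epoch_topic(top_topics, rng)
    regular = respond(topic, is_noise, regular_observed)
    # The attack endpoint has observed nothing, so only noise gets through.
    attack = respond(topic, is_noise, set())
    flagged = attack is not None
    return regular, flagged, is_noise
```

Under these assumptions, `flagged` matches `is_noise` in every epoch, so the attacker can drop exactly the random topics from what the regular endpoint received.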

Ensuring that an endpoint has never observed the user may be nontrivial, but a simple proxy would be to use the site the user is on as the caller. Any topic returned to this caller that is not the topic assigned to that site is thus a random topic.
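
A minimal sketch of this proxy check follows. The helper names and topic ids are hypothetical, and it assumes the site, acting as caller, has only ever observed the user on its own pages.

```python
def flag_random_topics(returned_topics, site_topics):
    """Topics returned to the site-as-caller that are not the site's own
    topics are treated as random."""
    own = set(site_topics)
    return [t for t in returned_topics if t not in own]

def filter_regular(regular_topics, flagged):
    """Drop the flagged topics from what the regular endpoint received."""
    flagged = set(flagged)
    return [t for t in regular_topics if t not in flagged]

# Hypothetical example: the site's own topic is 57.
flagged = flag_random_topics([57, 203], site_topics=[57])
cleaned = filter_regular([57, 203, 12], flagged)
```

Note that a random topic which happens to coincide with the site's own topic would go undetected by this check.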

What are your thoughts on this?
