Identifying the noisy topics

It seems to me that the current spec of the API may enable a simple and practical attack to identify the noisy topics, which DSPs could then filter out.

This attack relies on these two rules from the spec:
- _"The caller only receives topics it has observed the user visit in the past."_
- _"The exception to this filtering is the 5% random topic, that topic will not be filtered."_

A direct consequence of these rules is that if a caller has never observed the user before, then any topic it receives is a random topic.
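To make that consequence concrete, here is a toy model of the two quoted rules for a single epoch. It is only a sketch of my reading of the rules, not the browser's implementation: `topic_for_caller`, `TAXONOMY` and the way the noise is drawn are all hypothetical names and simplifications.

```python
import random

NOISE_PROBABILITY = 0.05  # the "5% random topic" from the rule quoted above


def topic_for_caller(real_topic, observed_by_caller, taxonomy, rng):
    """Toy model: what one caller receives for one epoch (None = nothing)."""
    if rng.random() < NOISE_PROBABILITY:
        # Random topic: per the second rule, it is not filtered.
        return rng.choice(taxonomy)
    # Real topic: per the first rule, only returned if the caller observed it.
    if real_topic in observed_by_caller:
        return real_topic
    return None


# Hypothetical four-topic taxonomy, for illustration only.
TAXONOMY = ["Fitness", "Travel", "Opera", "Cooking"]
rng = random.Random(0)

# A caller that has never observed the user (empty observation set).
results = [topic_for_caller("Fitness", set(), TAXONOMY, rng) for _ in range(10_000)]
received = [t for t in results if t is not None]
print(f"{len(received)} of {len(results)} calls returned a topic")
```

In this model the real-topic branch can never fire for a caller with an empty observation set, so every topic in `received` came from the noise branch.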

An attacker could thus call the API with two distinct endpoints:

- a regular endpoint, which observes as much of the web as possible in order to get as many user topics as possible (this is just regular API use).
- an attack endpoint, which has never observed the user before. Any topic returned to this endpoint is a random topic, and should be filtered out from the result of the regular query.
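The filtering step on the attacker's side could be as simple as the sketch below. It assumes that both endpoints are called from the same page and are offered the same per-epoch topics before caller filtering; that is my understanding of the design, but it is an assumption here. The function name and the sample topics are made up for illustration.

```python
def drop_noisy_topics(regular_topics, attack_topics):
    """Keep the topics from the regular endpoint that the attack endpoint did not see.

    Anything the never-observing attack endpoint received is treated as noise.
    """
    noisy = set(attack_topics)
    return [topic for topic in regular_topics if topic not in noisy]


# Hypothetical responses for one page load.
regular = ["Fitness", "Travel", "Opera"]  # regular endpoint: real topics plus noise
attack = ["Opera"]                        # attack endpoint: noise only

print(drop_noisy_topics(regular, attack))  # ['Fitness', 'Travel']
```

One side effect of this rule: a random topic that happens to coincide with a genuine topic of the user is dropped as well.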

Ensuring that an endpoint has never observed the user may be non-trivial, but a simple proxy would be to use as the caller the site the user is on. Any topic returned to this caller which is not the topic assigned to that site is thus a random topic.
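A sketch of this proxy variant, assuming the site only ever calls the API on its own pages and knows which topic (or topics) the classifier assigns to it. `noisy_topics_via_site_proxy` and its arguments are hypothetical names, not part of the API.

```python
def noisy_topics_via_site_proxy(topics_returned_to_site, site_own_topics):
    """Flag as random every returned topic that is not one of the site's own topics."""
    own = set(site_own_topics)
    return [topic for topic in topics_returned_to_site if topic not in own]


# Hypothetical example: a site classified as "Cooking" calls the API itself.
returned = ["Cooking", "Opera"]
print(noisy_topics_via_site_proxy(returned, ["Cooking"]))  # ['Opera']
```

The flagged topics can then be removed from whatever the regular endpoint received on the same page.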

What are your thoughts on this?


Identifying the noisy topics #75
