+
Skip to content

Best ways to handle failure to find a collision-free hash #6

@sqs

Description

@sqs

(I apologize if this is out of scope for this library, and for my relatively shallow understanding of MPHs.)

Using the FNV hash func and even some simple inputs (e.g., keys c and c2, where c is any character), the builder often fails to find a collision-free hash. This is not a limitation of this particular project; implementations of the same algorithm in other languages also frequently fail on this class of inputs. (I say "often" and "frequently" because the random seed is the current time, and sometimes it succeeds.)

Is there a generally accepted solution for handling this occurrence, such as retrying the generation with a different random seed? Or allowing a small number of collisions (and iterating over the collisions in Get) if the cardinality is small? (It might make the most sense to add support for these things in external code, not in this core mph lib. But I notice that gperf supports producing imperfect hashes, so there is some precedent for including them.)

So, basically, how do other people handle the failure to find a collision-free hash in their own code? (Just fall back to a map or another hashing scheme?)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载