Succinct Data Structures for Retrieval and Approximate Membership

Martin Dietzfelbinger, Rasmus Pagh

    Research output: Journal Article or Conference Article in Journal; Conference article; Research; peer-review

    Abstract

    The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function that has specified values on the elements of a given set S ⊆ U, |S| = n, but may have any value on elements outside S. All known methods (e.g. those based on perfect hash functions) induce a space overhead of Θ(n) bits over the optimum, regardless of the evaluation time. We show that for any k, query time O(k) can be achieved using space that is within a factor 1 + e^{-k} of optimal, asymptotically for large n. The time to construct the data structure is O(n), expected. If we allow logarithmic evaluation time, the additive overhead can be reduced to O(log log n) bits with high probability. A general reduction transfers the results on retrieval into analogous results on approximate membership, a problem traditionally addressed using Bloom filters. Thus we obtain space bounds arbitrarily close to the lower bound for this problem as well. The evaluation procedures of our data structures are extremely simple. For the results stated above we assume free access to fully random hash functions. This assumption can be justified using space o(n) to simulate full randomness on a RAM.
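
The retrieval problem and the reduction to approximate membership can be illustrated with a small sketch. It is not the construction from the paper: it solves one global linear system over GF(2) by plain Gaussian elimination, so construction is not O(n) and the space is roughly 1.3 times the optimum rather than a factor 1 + e^{-k}. The names `Retrieval`, `ApproxMembership`, `_hash`, and `fingerprint`, the slack factor 1.3, and the use of BLAKE2b as a stand-in for fully random hash functions are all assumptions made for illustration.

```python
import hashlib


def _hash(key, seed, salt, mod):
    # Hypothetical helper: BLAKE2b stands in for the fully random
    # hash functions that the abstract assumes.
    data = f"{seed}:{salt}:{key}".encode()
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big") % mod


def fingerprint(key, r):
    # r-bit fingerprint of a key, used by the membership reduction.
    return _hash(key, "fp", 0, 1 << r)


class Retrieval:
    """Stores f: S -> non-negative ints; answers on keys outside S are arbitrary."""

    def __init__(self, items, k=3, slack=1.3, max_tries=50):
        self.k = k
        self.m = int(slack * len(items)) + 8  # table size (assumed slack)
        for seed in range(max_tries):
            self.seed = seed
            table = self._solve(items)
            if table is not None:
                self.table = table
                return
        raise RuntimeError("no solvable system found")

    def _positions(self, key):
        return [_hash(key, self.seed, j, self.m) for j in range(self.k)]

    def _solve(self, items):
        # One equation per key: XOR of table[p] over its positions == value.
        pivots = {}  # highest set column -> (row bitmask, right-hand side)
        for key, value in items.items():
            mask = 0
            for p in self._positions(key):
                mask ^= 1 << p  # repeated positions cancel, as in query()
            while mask:
                col = mask.bit_length() - 1
                if col not in pivots:
                    pivots[col] = (mask, value)
                    break
                pmask, pval = pivots[col]
                mask ^= pmask
                value ^= pval
            else:
                if value != 0:
                    return None  # inconsistent system, retry with a new seed
        # Back substitution: free cells stay 0; every other bit of a pivot
        # row lies in a lower column, so ascending order suffices.
        table = [0] * self.m
        for col in sorted(pivots):
            mask, value = pivots[col]
            rest = mask ^ (1 << col)
            while rest:
                low = rest & -rest
                value ^= table[low.bit_length() - 1]
                rest ^= low
            table[col] = value
        return table

    def query(self, key):
        # Evaluation reads k table cells and XORs them.
        out = 0
        for p in self._positions(key):
            out ^= self.table[p]
        return out


class ApproxMembership:
    """Reduction: store an r-bit fingerprint per key in a retrieval structure."""

    def __init__(self, keys, r):
        self.r = r
        self.ret = Retrieval({x: fingerprint(x, r) for x in keys})

    def __contains__(self, x):
        # Never wrong on stored keys; a non-key matches with probability
        # about 2^-r if the hashes behave like random functions.
        return self.ret.query(x) == fingerprint(x, self.r)
```

In this sketch a query touches k = 3 table cells, which mirrors the O(k) evaluation time in the abstract, and `x in ApproxMembership(keys, r=8)` is always true for stored keys while non-keys are accepted only when their fingerprint happens to match.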
    Original language: English
    Book series: Lecture Notes in Computer Science
    Pages (from-to): 385-396
    Number of pages: 12
    ISSN: 0302-9743
    Publication status: Published - 2008
    Event: ICALP 2008, 35th International Colloquium on Automata, Languages and Programming - Reykjavik, Iceland
    Duration: 6 Jul 2008 to 13 Jul 2008
    Conference number: 35

    Conference

    Conference: ICALP 2008, 35th International Colloquium on Automata, Languages and Programming
    Number: 35
    Country/Territory: Iceland
    City: Reykjavik
    Period: 06/07/2008 to 13/07/2008

    Keywords

    • retrieval problem
    • data structure
    • perfect hash functions
    • approximate membership
    • Bloom filters
