Derandomization, Hashing and Expanders

Milan Ruzic

    Publication: PhD thesis


    In computational complexity, randomness is a significant resource besides time and space.
    Particularly from a theoretical viewpoint, it is a fundamental question whether the availability
    of random numbers gives any additional power. Most randomized algorithms are analyzed under the
    assumption that independent and unbiased random bits are accessible. However, truly random
    bits are scarce in reality. In practice, pseudorandom generators are used in place of random
    numbers; usually, even the seed of the generator does not come from a source of true random-
    ness. While things mostly work well in practice, there are occasional problems caused by the use
    of weak pseudorandom generators. Further, randomized algorithms are not suited to applications
    where reliability is a key concern.
    Derandomization is the process of minimizing the use of random bits, either reducing them to
    small amounts or removing them altogether. We may identify two lines of work in this direction.
    There has been a lot of work on designing general tools for simulating randomness and making
    deterministic versions of randomized algorithms, with some loss in time and space performance.
    These methods are not tied to particular algorithms, but work on large classes of problems. The
    central question in this area of computational complexity is "P = BPP?".
    Instead of derandomizing whole complexity classes, one may work on derandomizing concrete
    problems. This approach trades generality for the possibility of much better performance
    bounds. There are a few common techniques for derandomizing concrete problems, but often one
    needs to specifically design a new method that is "friendlier" to deterministic computation. This
    kind of solution prevails in this thesis.
    A central part of the thesis consists of algorithms for deterministic selection of hash functions
    that map a given set to a "nicely spread" image. The main application is the design of efficient
    dictionary data structures. A dictionary stores a set of keys from some universe, and allows the
    user to search through the set, looking for any value from the universe. Additional information
    may be associated with each key and retrieved on successful lookup. In a static dictionary the
    stored set remains fixed after initialization, while in a dynamic dictionary it may change over time.
    Our static dictionaries attain worst-case performance that is very close to the expected
    performance of the best randomized dictionaries. In the dynamic case the gap is larger; it is a
    significant open question to establish whether a gap between deterministic and randomized
    dynamic dictionaries is inherent.
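    To make the static case concrete, the simplest deterministic static dictionary is a sorted array queried by binary search; the sketch below is purely illustrative (the class name and interface are hypothetical, and the thesis' hash-based structures achieve faster lookups than this logarithmic-time baseline).

    ```python
    from bisect import bisect_left

    class StaticDictionary:
        """Static dictionary: the stored set is fixed at initialization."""

        def __init__(self, items):
            pairs = sorted(items)                 # build once, never updated
            self.keys = [k for k, _ in pairs]
            self.vals = [v for _, v in pairs]

        def lookup(self, key):
            # Binary search over the sorted keys.
            i = bisect_left(self.keys, key)
            if i < len(self.keys) and self.keys[i] == key:
                return self.vals[i]               # associated information
            return None                           # key not in the stored set
    ```

    A dynamic dictionary would additionally support insertions and deletions after initialization, which is exactly where the deterministic-versus-randomized gap discussed above arises.
    
    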
    We also give a new analysis of classical linear probing hash tables, showing that they work
    well with simple and efficiently computable hash functions. Here we have a randomized structure
    in which the randomness requirements are cut down to a reasonable level. Traditionally, linear
    probing was analyzed under the unrealistic uniform hashing assumption, namely that the hash
    function employed behaves like a truly random function. This was later improved to explicit, but
    cumbersome and inefficient, families of functions. Our analysis shows that practically usable hash
    functions suffice, but the simplest kinds of functions do not.
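    The mechanics of linear probing itself are simple: on a collision, scan forward to the next free slot. The following is a minimal illustrative sketch (names are hypothetical; Python's built-in hash stands in for the simple hash function families whose suitability the analysis addresses).

    ```python
    class LinearProbingTable:
        def __init__(self, capacity=16):
            self.capacity = capacity
            self.slots = [None] * capacity     # each slot: (key, value) or None

        def _index(self, key):
            # Placeholder for an efficiently computable hash function.
            return hash(key) % self.capacity

        def insert(self, key, value):
            i = self._index(key)
            for _ in range(self.capacity):
                if self.slots[i] is None or self.slots[i][0] == key:
                    self.slots[i] = (key, value)
                    return
                i = (i + 1) % self.capacity    # probe the next slot linearly
            raise RuntimeError("table full")

        def lookup(self, key):
            i = self._index(key)
            for _ in range(self.capacity):
                if self.slots[i] is None:
                    return None                # empty slot ends the probe run
                if self.slots[i][0] == key:
                    return self.slots[i][1]
                i = (i + 1) % self.capacity
            return None
    ```

    The quality of the hash function determines how long the probe runs grow, which is why the choice of function family is the crux of the analysis.
    
    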
    Apart from dictionaries, we look at the problem of sparse approximations of vectors, which
    has applications in different areas such as data stream computation and compressed sensing. We
    present a method that achieves close to optimal performance on virtually all attributes. It is
    deterministic in the sense that a single measurement matrix works for all inputs.
    One of our dictionary results and the result on sparse recovery of vectors share an impor-
    tant tool, although the problems are unrelated. The shared tool is a type of expander graph.
    We employ bipartite expander graphs with unbalanced sides. For some algorithms, expander
    graphs capture all the required "random-like" properties. In such cases they can replace the use
    of randomness, while maintaining about the same algorithmic performance.
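    Concretely, the vertex-expansion property in question can be stated as follows; this is the standard definition of an unbalanced bipartite expander, not a quotation from the thesis:

    ```latex
    \text{A bipartite graph } G=(U,V,E) \text{ with left degree } d
    \text{ is a } (K,\varepsilon)\text{-expander if every } S \subseteq U
    \text{ with } |S| \le K \text{ satisfies }
    |\Gamma(S)| \ge (1-\varepsilon)\, d\, |S|,
    ```

    where \(\Gamma(S)\) denotes the set of neighbors of \(S\); "unbalanced" refers to \(|V|\) being much smaller than \(|U|\).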
    The problems that we study both require and admit fast solutions. The algorithms involved have
    linear or near-linear running times. Even sublogarithmic factors in performance bounds are
    meaningful. With such high demands, one has to look for specific deterministic solutions that
    are efficient for particular problems; the general derandomization tools would be of no use.
    Publisher: IT-Universitetet i København
    Status: Published - 2009

