Massive Data Mining by Sampling

  • Pagh, Rasmus (PI)
  • Stöckel, Morten (CoI)
  • Pham, Ninh Dang (CoI)

Project: Research

Project Details


Data mining is the driving force behind a paradigm shift in the way we conceive models of our surroundings. Society relies extensively on models of physical, social, and economic phenomena that give us the predictive power to build a bridge confident that it will not collapse, forecast the effect of economic stimulus packages, select the best livestock for breeding, and so on. The massive amount of data that has become available opens up completely new ways of understanding and modeling data through the use of algorithms.

In this project we will explore the possibilities of a new approach to massive data mining that originates in advanced sampling methods from data stream algorithmics. This happy marriage of efficient algorithms and statistical principles is the unifying idea of several recent highly successful research contributions of the applicant in the field of data mining. In collaboration with external partners, the project will focus on three application areas with a common theoretical core: financial modeling, recommendation systems, and genotype/phenotype mining.
Effective start/end date: 01/01/2011 to 31/12/2014


  • Best Paper Award

    Ninh Dang Pham (Participant)

    7 Apr 2014 to 11 Apr 2014

    Activity: Other activity types > Other (prizes, external teaching and other activities) - Prizes, scholarships, distinctions