On finding frequent patterns in event sequences

Andrea Campagna, Rasmus Pagh

    Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

    Abstract

    Given a directed acyclic graph with labeled vertices, we consider the problem of finding the most common label sequences (``traces'') among all paths in the graph (of some maximum length $m$). Since the number of paths can be huge, we propose novel algorithms whose time complexity depends only on the size of the graph, and on the frequency $\varepsilon$ of the most frequent traces. In addition, we apply techniques from streaming algorithms to achieve space usage that depends only on $\varepsilon$, and not on the number of distinct traces.

    The abstract problem considered models a variety of tasks concerning finding frequent patterns in event sequences. Our motivation comes from working with a data set of 2 million RFID readings from baggage trolleys at Copenhagen Airport. The question of finding frequent passenger movement patterns is mapped to the above problem. We report on experimental findings for this data set.
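
To make the problem statement concrete, here is a minimal brute-force sketch that tallies traces over all paths in a small labeled DAG. It is not the algorithm proposed in the paper: it enumerates every path explicitly, which is exactly the cost the authors aim to avoid. The function name `count_traces`, the dictionary-based graph representation, the toy airport labels, and the reading of "maximum length $m$" as "at most $m$ vertices" are all assumptions made for illustration.

```python
from collections import Counter


def count_traces(labels, edges, m):
    """Count label sequences (traces) over all paths with 1..m vertices.

    labels: dict mapping vertex -> label
    edges:  dict mapping vertex -> list of successor vertices (must be a DAG)
    m:      assumed maximum number of vertices on a path
    """
    counts = Counter()

    def extend(vertex, trace):
        # Append this vertex's label and record the resulting trace once
        # for the path that ends here.
        trace = trace + (labels[vertex],)
        counts[trace] += 1
        if len(trace) < m:
            for nxt in edges.get(vertex, ()):
                extend(nxt, trace)

    # Every vertex is a possible starting point of a path.
    for v in labels:
        extend(v, ())
    return counts


# Hypothetical toy graph: vertices 3 and 4 share the label "gate",
# so two distinct paths produce the same trace.
labels = {1: "checkin", 2: "security", 3: "gate", 4: "gate"}
edges = {1: [2], 2: [3, 4]}
counts = count_traces(labels, edges, m=3)
```

Because distinct vertices can carry the same label, several paths may map to one trace, and the counter grows with the number of distinct traces. The abstract's streaming techniques target this memory cost by making space depend only on $\varepsilon$.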
    Original language: English
    Title of host publication: ICDM 2010 : Proceedings of the Tenth IEEE International Conference on Data Mining
    Number of pages: 6
    Publisher: IEEE
    Publication date: 14 Dec 2010
    Publication status: Published - 14 Dec 2010
    Event: IEEE International Conference on Data Mining - Sydney, Australia
    Duration: 14 Dec 2010 → 17 Dec 2010
    http://www.cs.uvm.edu/~icdm/

    Conference

    Conference: IEEE International Conference on Data Mining
    Country/Territory: Australia
    City: Sydney
    Period: 14/12/2010 → 17/12/2010

    Keywords

    • Directed acyclic graph
    • Frequent pattern mining
    • Trace analysis
    • Streaming algorithms
    • Event sequences