Entropy as a Measure of Log Variability

Christoffer Olling Back, Søren Debois, Tijs Slaats

Research output: Journal Article or Conference Article in Journal › Journal article › Research › peer-review

Abstract

Process mining algorithms fall into two classes: imperative miners output flow diagrams, showing all possible paths, whereas declarative miners output constraints, showing the rules governing a process. But given a log, how do we know which of the two to apply? Assuming that logs exhibiting a large degree of variability are more suited for declarative miners, we can attempt to answer this question by defining a suitable measure of the variability of the log. This paper reports on an exploratory study into the use of entropy measures as metrics of variability. We survey notions of entropy used in, e.g., physics; we propose variant notions likely more suitable for the field of process mining; we provide an implementation of every entropy notion discussed; and we report entropy measures for a collection of both synthetic and real-life logs. Finally, based on anecdotal indications of which logs are better suited for declarative/imperative mining, we identify the most promising measures for future studies. For estimating overall entropy, global block and k-nearest neighbour estimators of entropy appear most promising and excel at identifying noise in logs. For estimating entropy rate, we identify Lempel–Ziv and certain variants of k-block estimators as performing well, and note that the former is more stable, but sensitive to noise, while the latter is less stable, being sensitive to cut-off constraints determining block size.
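As a rough illustration of the idea in the abstract (more variable logs should score higher on an entropy measure), the following is a minimal sketch of naive plug-in Shannon entropy computed over whole traces and over length-k blocks of activities. It is not the paper's implementation and does not reproduce its estimators or cut-off rules; the function names and the two toy logs are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def shannon_entropy(counts):
    """Plug-in Shannon entropy (in bits) of a collection of frequency counts."""
    total = sum(counts)
    # p * log2(1/p) with p = c / total, written to avoid a negative zero.
    return sum((c / total) * log2(total / c) for c in counts if c > 0)


def trace_entropy(log):
    """Entropy where each distinct whole trace counts as one symbol."""
    return shannon_entropy(Counter(tuple(t) for t in log).values())


def k_block_entropy(log, k):
    """Entropy over all length-k windows of consecutive activities."""
    blocks = Counter(
        tuple(t[i:i + k]) for t in log for i in range(len(t) - k + 1)
    )
    return shannon_entropy(blocks.values())


# Hypothetical toy logs: one rigid process, one with varying activity order.
rigid = [["a", "b", "c"]] * 4
flexible = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"], ["c", "b", "a"]]

print(trace_entropy(rigid), trace_entropy(flexible))
print(k_block_entropy(rigid, 2), k_block_entropy(flexible, 2))
```

In this toy setting the rigid log scores lower than the flexible one on both measures. On real logs, plug-in estimates of this kind are biased when the sample is small relative to the number of possible blocks.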
Original language: English
Journal: Journal on Data Semantics
Pages (from-to): 1-28
ISSN: 1861-2032
Publication status: Published - 14 Jun 2019

Keywords

  • Knowledge Work
  • Entropy
  • Information Theory
  • Process Flexibility
  • Process Variability
  • Process Mining
  • Hybrid Models

  • Towards an Entropy-based Analysis of Log Variability

    Back, C. O., Debois, S. & Slaats, T., 17 Jan 2018, International Conference on Business Process Management: BPM 2017: Business Process Management Workshops. Springer, p. 53-70 (Lecture Notes in Business Information Processing, Vol. 308).

    Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review
