TY - ICOMM
T1 - Integer Set Compression and Statistical Modeling
AU - Larsson, N. Jesper
PY - 2014
Y1 - 2014
N2 - Compression of integer sets and sequences has been extensively studied for settings where elements follow a uniform probability distribution. In addition, methods exist that exploit clustering of elements in order to achieve higher compression performance. In this work, we address the case where enumeration of elements may be arbitrary or random, but where statistics are kept in order to estimate probabilities of elements. We present a recursive subset-size encoding method that is able to benefit from statistics, explore the effects of permuting the enumeration order based on element probabilities, and discuss general properties and possibilities for this class of compression problem.
KW - Integer Compression
KW - Probability Distribution
KW - Clustering
KW - Subset-size Encoding
KW - Statistics-based Permutation
M3 - Net publication - Internet publication
ER -