TY - JOUR
T1 - Simple multi-party set reconciliation
AU - Mitzenmacher, Michael
AU - Pagh, Rasmus
PY - 2017
Y1 - 2017
N2 - Many distributed cloud-based services use multiple loosely consistent replicas of user information to avoid the high overhead of more tightly coupled synchronization. Periodically, the information must be synchronized, or reconciled. One can place this problem in the theoretical framework of set reconciliation: two parties A1A1 and A2A2 each hold a set of keys, named S1S1 and S2S2 respectively, and the goal is for both parties to obtain S1∪S2S1∪S2. Typically, set reconciliation is interesting algorithmically when sets are large but the set difference |S1−S2|+|S2−S1||S1−S2|+|S2−S1| is small. In this setting the focus is on accomplishing reconciliation efficiently in terms of communication; ideally, the communication should depend on the size of the set difference, and not on the size of the sets. In this paper, we extend recent approaches using Invertible Bloom Lookup Tables (IBLTs) for set reconciliation to the multi-party setting. There are three or more parties A1,A2,…,AnA1,A2,…,An holding sets of keys S1,S2,…,SnS1,S2,…,Sn respectively, and the goal is for all parties to obtain ∪iSi∪iSi. While this could be done by pairwise reconciliations, we seek more effective methods. Our general approach can function even if the number of parties is not exactly known in advance, and with some additional cost can be used to determine which other parties hold missing keys. Our methodology uses network coding techniques in conjunction with IBLTs, allowing efficiency in network utilization along with efficiency obtained by passing messages of size O(|∪iSi−∩iSi|)O(|∪iSi−∩iSi|). By connecting reconciliation with network coding, we can provide efficient reconciliation methods for a number of natural distributed settings.
AB - Many distributed cloud-based services use multiple loosely consistent replicas of user information to avoid the high overhead of more tightly coupled synchronization. Periodically, the information must be synchronized, or reconciled. One can place this problem in the theoretical framework of set reconciliation: two parties A1A1 and A2A2 each hold a set of keys, named S1S1 and S2S2 respectively, and the goal is for both parties to obtain S1∪S2S1∪S2. Typically, set reconciliation is interesting algorithmically when sets are large but the set difference |S1−S2|+|S2−S1||S1−S2|+|S2−S1| is small. In this setting the focus is on accomplishing reconciliation efficiently in terms of communication; ideally, the communication should depend on the size of the set difference, and not on the size of the sets. In this paper, we extend recent approaches using Invertible Bloom Lookup Tables (IBLTs) for set reconciliation to the multi-party setting. There are three or more parties A1,A2,…,AnA1,A2,…,An holding sets of keys S1,S2,…,SnS1,S2,…,Sn respectively, and the goal is for all parties to obtain ∪iSi∪iSi. While this could be done by pairwise reconciliations, we seek more effective methods. Our general approach can function even if the number of parties is not exactly known in advance, and with some additional cost can be used to determine which other parties hold missing keys. Our methodology uses network coding techniques in conjunction with IBLTs, allowing efficiency in network utilization along with efficiency obtained by passing messages of size O(|∪iSi−∩iSi|)O(|∪iSi−∩iSi|). By connecting reconciliation with network coding, we can provide efficient reconciliation methods for a number of natural distributed settings.
KW - set reconciliation
KW - synchronization
KW - multi-party
KW - network coding
KW - hashing
KW - sketching
KW - set reconciliation
KW - synchronization
KW - multi-party
KW - network coding
KW - hashing
KW - sketching
U2 - 10.1007/s00446-017-0316-0
DO - 10.1007/s00446-017-0316-0
M3 - Journal article
SN - 0178-2770
SP - 1
EP - 13
JO - Distributed Computing
JF - Distributed Computing
ER -