ITU

Noise Corrected Sampling of Online Social Networks

Research output: Journal Article or Conference Article in Journal › Journal article › Research › peer-review

Standard

Noise Corrected Sampling of Online Social Networks. / Coscia, Michele.

In: ACM Transactions on Knowledge Discovery from Data, Vol. 15, No. 2, 29, 01.03.2021, p. 1-21.

Bibtex

@article{e33f3c69a577420e8d70de703a867f3e,
title = "Noise Corrected Sampling of Online Social Networks",
abstract = "In this article, we propose a new method to perform topological network sampling. Topological network sampling is a process for extracting a subset of nodes and edges from a network, such that analyses on the sample provide results and conclusions comparable to the ones they would return if run on whole structure. We need network sampling because the largest online network datasets are accessed through low-throughput application programming interface (API) systems, rendering the collection of the whole network infeasible. Our method is inspired by the literature on network backboning, specifically the noise-corrected backbone. We select the next node to explore by following the edge we identify as the one providing the largest information gain, given the topology of the sample explored so far. We evaluate our method against the most commonly used sampling methods. We do so in a realistic framework, considering a wide array of network topologies, network analysis, and features of API systems. There is no method that can provide the best sample in all possible scenarios, thus in our results section, we show the cases in which our method performs best and the cases in which it performs worst. Overall, the noise-corrected network sampling performs well: it has the best rank average among the tested methods across a wide range of applications.",
keywords = "network sampling, network backboning, social media, social networks",
author = "Michele Coscia",
year = "2021",
month = mar,
day = "1",
doi = "10.1145/3434749",
language = "English",
volume = "15",
pages = "1--21",
journal = "ACM Transactions on Knowledge Discovery from Data",
issn = "1556-4681",
publisher = "Association for Computing Machinery",
number = "2",
}

RIS

TY - JOUR

T1 - Noise Corrected Sampling of Online Social Networks

AU - Coscia, Michele

PY - 2021/3/1

Y1 - 2021/3/1

N2 - In this article, we propose a new method to perform topological network sampling. Topological network sampling is a process for extracting a subset of nodes and edges from a network, such that analyses on the sample provide results and conclusions comparable to the ones they would return if run on whole structure. We need network sampling because the largest online network datasets are accessed through low-throughput application programming interface (API) systems, rendering the collection of the whole network infeasible. Our method is inspired by the literature on network backboning, specifically the noise-corrected backbone. We select the next node to explore by following the edge we identify as the one providing the largest information gain, given the topology of the sample explored so far. We evaluate our method against the most commonly used sampling methods. We do so in a realistic framework, considering a wide array of network topologies, network analysis, and features of API systems. There is no method that can provide the best sample in all possible scenarios, thus in our results section, we show the cases in which our method performs best and the cases in which it performs worst. Overall, the noise-corrected network sampling performs well: it has the best rank average among the tested methods across a wide range of applications.

AB - In this article, we propose a new method to perform topological network sampling. Topological network sampling is a process for extracting a subset of nodes and edges from a network, such that analyses on the sample provide results and conclusions comparable to the ones they would return if run on whole structure. We need network sampling because the largest online network datasets are accessed through low-throughput application programming interface (API) systems, rendering the collection of the whole network infeasible. Our method is inspired by the literature on network backboning, specifically the noise-corrected backbone. We select the next node to explore by following the edge we identify as the one providing the largest information gain, given the topology of the sample explored so far. We evaluate our method against the most commonly used sampling methods. We do so in a realistic framework, considering a wide array of network topologies, network analysis, and features of API systems. There is no method that can provide the best sample in all possible scenarios, thus in our results section, we show the cases in which our method performs best and the cases in which it performs worst. Overall, the noise-corrected network sampling performs well: it has the best rank average among the tested methods across a wide range of applications.

KW - network sampling

KW - network backboning

KW - social media

KW - social networks

U2 - 10.1145/3434749

DO - 10.1145/3434749

M3 - Journal article

VL - 15

SP - 1

EP - 21

JO - ACM Transactions on Knowledge Discovery from Data

JF - ACM Transactions on Knowledge Discovery from Data

SN - 1556-4681

IS - 2

M1 - 29

ER -
