Text Mining for Information Systems Researchers: An Annotated Topic Modeling Tutorial

Stefan Debortoli, Oliver Müller, Iris Junglas, Jan vom Brocke

Publication: Journal article and conference article in journal > Journal article > Research > Peer-reviewed

Abstract

It is estimated that more than 80 percent of today’s data is stored in unstructured form (e.g., text, audio, image, video); and much of it is expressed in rich and ambiguous natural language. Traditionally, the analysis of natural language has prompted the use of qualitative data analysis approaches, such as manual coding. Yet, the size of text data sets obtained from the Internet makes manual analysis virtually impossible. In this tutorial, we discuss the challenges encountered when applying automated text-mining techniques in information systems research. In particular, we showcase the use of probabilistic topic modeling via Latent Dirichlet Allocation, an unsupervised text mining technique, in combination with a LASSO multinomial logistic regression to explain user satisfaction with an IT artifact by automatically analyzing more than 12,000 online customer reviews. For fellow information systems researchers, this tutorial provides some guidance for conducting text mining studies on their own and for evaluating the quality of others.
Original language: English
Article number: 7
Journal: Communications of the Association for Information Systems (CAIS)
Volume: 39
Issue number: 1
Number of pages: 28
ISSN: 1529-3181
Status: Published - 2016
