SHR++: An Interface for Morpho-syntactic Annotation of Sanskrit Corpora

Amrith Krishna, Shiv Vidhyut, Dilpreet Chawla, Sruti Sambhavi, Pawan Goyal

Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

Abstract

We propose SHR++, a web-based framework for morpho-syntactic annotation of Sanskrit corpora. SHR++ is designed
to generate annotations for the word-segmentation, morphological parsing and dependency analysis tasks in Sanskrit. It incorporates
analyses and predictions from various tools designed for processing Sanskrit texts, and uses them to ease the cognitive load on
human annotators. Specifically, SHR++ uses the Sanskrit Heritage Reader (Goyal and Huet, 2016), a lexicon-driven shallow parser,
to enumerate all the phonetically and lexically valid word splits, along with their morphological analyses, for a given string.
Annotators can thus choose among candidate solutions rather than perform the segmentation themselves. Further, predictions from
a word segmentation tool (Krishna et al., 2018) are added as suggestions to aid the annotators' decision making.
Our evaluation shows that enabling this segmentation suggestion component reduces annotation time by 20.15%. SHR++ can
be accessed online at http://vidhyut97.pythonanywhere.com/, and the codebase, for independent deployment of the system elsewhere, is hosted at https://github.com/iamdsc/smart-sanskrit-annotator.
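
The idea of enumerating every lexically valid split of an unsegmented string, as described in the abstract, can be illustrated with a minimal sketch. This is not the Sanskrit Heritage Reader's algorithm: it ignores the phonetic (sandhi) transformations at word boundaries and produces no morphological analyses. The function name `enumerate_splits` and the toy lexicon of abstract tokens are assumptions made purely for illustration.

```python
from functools import lru_cache


def enumerate_splits(text, lexicon):
    """Return every way to cut `text` into pieces that all occur in `lexicon`.

    Simplifying assumption: words are concatenated unchanged, so no
    sandhi rules are applied or undone at the boundaries.
    """
    words = frozenset(lexicon)

    @lru_cache(maxsize=None)
    def splits_from(start):
        # Exactly one way to split the empty remainder: no further words.
        if start == len(text):
            return [[]]
        results = []
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in words:
                # Extend each split of the remainder with the current piece.
                for rest in splits_from(end):
                    results.append([piece] + rest)
        return results

    return splits_from(0)


# Hypothetical lexicon of abstract tokens (not real Sanskrit forms).
toy_lexicon = {"ab", "abc", "c", "cd", "d"}
candidates = enumerate_splits("abcd", toy_lexicon)
for split in candidates:
    print(" + ".join(split))
```

An annotator-facing tool would then present such candidates for selection, optionally highlighting the one predicted by a segmentation model, instead of asking the annotator to produce the split from scratch.
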
Original language: English
Title of host publication: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)
Publisher: Association for Computational Linguistics
Publication date: Feb 2020
Pages: 7069–7076
Publication status: Published - Feb 2020

Keywords

  • web-based annotation framework
  • morpho-syntactic annotation
  • Sanskrit corpus
  • word-segmentation
  • morphological parsing
  • dependency analysis
  • Sanskrit Heritage Reader
  • cognitive load reduction
  • annotation tool evaluation
  • online annotation platform
