ITU

A structural query system for Han characters

Research output: Journal Article or Conference Article in Journal › Journal article › Research › peer-review


The IDSgrep structural query system for Han character dictionaries is presented. This dictionary search system represents the spatial structure of Han characters using Extended Ideographic Description Sequences (EIDSes), a data model and syntax based on the Unicode IDS concept. It includes a query language for EIDS databases, with a freely available implementation and format translation from popular third-party IDS and XML character databases. The system is designed to suit the needs of font developers and foreign language learners. The search algorithm includes a bit vector index inspired by Bloom filters to support faster query operations. Experimental results are presented, evaluating the effect of the indexing on query performance.
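The abstract mentions a bit vector index inspired by Bloom filters. As a rough illustration of that general idea only, and not of IDSgrep's actual index or EIDS format, the following sketch assumes a hypothetical flat decomposition of each character into a set of components. Every name in it (`signature`, `may_match`, `search`, `DATABASE`) and the toy data are assumptions made for this example. Each entry gets a fixed-width bit signature; a query can skip any entry whose signature lacks a bit the query requires, and runs the exact (slower) match only on the remaining candidates.

```python
import hashlib

BITS = 64    # signature width; an arbitrary choice for this sketch
HASHES = 2   # bit positions set per component

def signature(components):
    """OR together a few hashed bit positions for every component."""
    sig = 0
    for comp in components:
        digest = hashlib.sha256(comp.encode("utf-8")).digest()
        for k in range(HASHES):
            sig |= 1 << (digest[k] % BITS)
    return sig

def may_match(query_sig, entry_sig):
    """False means the entry certainly lacks a required component.
    True means it might match (false positives are possible)."""
    return query_sig & ~entry_sig == 0

# Toy database: character -> components (hypothetical flat decomposition).
DATABASE = {
    "明": {"日", "月"},
    "語": {"言", "五", "口"},
    "時": {"日", "土", "寸"},
}
INDEX = {ch: signature(parts) for ch, parts in DATABASE.items()}

def search(required):
    """Return characters containing every component in `required`."""
    query_sig = signature(required)
    results = []
    for ch, parts in DATABASE.items():
        if not may_match(query_sig, INDEX[ch]):
            continue              # cheap rejection using the bit vector
        if required <= parts:     # exact check removes false positives
            results.append(ch)
    return results
```

Because the exact check always runs on surviving candidates, the filter in this sketch can only affect speed, never the result set; how much it helps depends on the signature width and on how selective the query is.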
Original language: English
Journal: International Journal of Asian Language Processing
Volume: 23
Issue number: 2
Pages (from-to): 127-159
Publication status: Published - Jan 2016

ID: 80538385