
Airavata: Introducing Hindi Instruction-tuned LLM

  • Jay P. Gala
  • Thanmay Jayakumar
  • Jaavid Aktar Husain
  • Kumar M. Aswanth
  • Mohammed Safi Ur Rahman Khan
  • Diptesh Kanojia
  • Ratish Puduppully
  • Mitesh M. Khapra
  • Raj Dabre
  • V. Rudra Murthy
  • Anoop Kunchukuttan
  • Indian Institute of Technology Madras
  • Indian Institute of Information Technology, Design and Manufacturing (Kancheepuram)
  • University of Surrey
  • Agency for Science, Technology and Research (A*Star)
  • National Institute Of Information And Communications Technology, Japan
  • IBM Research India
  • Microsoft India

Publication: Other / Other contribution / Research

Abstract

We announce the initial release of "Airavata," an instruction-tuned LLM for Hindi. Airavata was created by fine-tuning OpenHathi with diverse, instruction-tuning Hindi datasets to make it better suited for assistive tasks. Along with the model, we also share the IndicInstruct dataset, which is a collection of diverse instruction-tuning datasets to enable further research for Indic LLMs. Additionally, we present evaluation benchmarks and a framework for assessing LLM performance across tasks in Hindi. Currently, Airavata supports Hindi, but we plan to expand this to all 22 scheduled Indic languages. You can access all artifacts at this https URL.
Original language: English
Publication date: 2024
Volume: abs/2401.15006
Number of pages: 20
DOI
Status: Published - 2024
Published externally: Yes
