Rudali Huidrom's work features in Wikipedia
Source: Chronicle News Service
Imphal, August 27 2022:
Rudali Huidrom's work on developing an online database of the Manipuri language (Meeteilol) to assist in its machine translation into other languages has been featured in Wikipedia, the largest free online encyclopedia.
Wikipedia is an online encyclopedia that anyone can edit.
Its main purpose is to benefit readers by acting as a widely accessible and free encyclopedia that contains information on all branches of knowledge.
It is supported by the Wikimedia Foundation and consists of freely editable content.
Wikipedia is written collaboratively by largely anonymous volunteers.
Anyone with Internet access and in good standing can write and make changes to Wikipedia articles, except in limited cases where editing is restricted to prevent disruption or vandalism.
Since its creation on January 15, 2001, Wikipedia has grown into the world's largest reference website, attracting over a billion visitors monthly.
It currently has more than 59 million articles in more than 300 languages, including 6,543,425 articles in English with 114,504 active contributors in the past month.
It has 279,771 active editors and 103,476,599 registered users.
Articles in Wikipedia are available in 329 languages.
The Wikipedia article on Rudali Huidrom's work states, "In 2021, Rudali Huidrom, a Manipuri researcher of the EBMT/NLP laboratory, Waseda University, Japan, created a text corpus named 'EM Corpus' (shortened form of 'Emalon Manipuri Corpus').
It is the first comparable text to text corpus built for Meitei language (mni) and English language (eng) pair from sentences.
The writing system used for Meitei language in this corpus is Bengali script.
It was crawled and collected from the news website of a daily newspaper of Manipur from August 2020 to 2021.
In version 1, she created the monolingual data, having 1,034,715 Meitei language sentences and 846,796 English language sentences.
In version 2, she created the monolingual data, having 1,880,035 Meitei language sentences and 1,450,053 English language sentences".
"EM-ALBERT is the first ALBERT model available for Meitei language.
EM-FT is also FastText word embedding available for Meitei language.
These resources were created by Rudali Huidrom and are now available at free of cost at the European Language Resources Association catalogue (ELRA catalogue) under CC-BY-NC-4.0 license".
The article was contributed by a user called Haoreima, who has already contributed numerous articles to Wikipedia.
The user has won several awards for these contributions, including first place in WikiGap Challenge (Global ranking - 2022), Feminism and Folklore (2022), Wikipedia Asian Month (2021), third place in Wiki Loves Women South Asia (2021) and 12th place in Ukraine's Cultural Diplomacy Month (Global ranking - 2022).
It may be mentioned that on May 11, 2022, Google Translate added Manipuri as one of 24 new languages introduced to the translation tool.
The writing system used for the Manipuri language in this tool is the Meetei script.
Rudali took up the task during her research at Waseda University, Japan, on equipping the Manipuri language for machine translation.
She published her thesis, titled 'Machine Translation for a less-resourced language: Manipuri (Meiteilon)', under the supervision of Professor Yves Lepage.
She is now pursuing a PhD on 'Semantic & Stylistic controllability in Natural Language Generation' under the supervision of Professor Anya Belz at Dublin City University, Dublin, Ireland.