What International Mother Language Day Means for Indian Languages

While India's major languages have transitioned well to the digital realm, smaller local languages still have a long way to go.

Nearly seventy years ago, a group of students from Dhaka University protested against the police of what was then East Pakistan, in an effort to have Bengali recognised as an official language.

These activists eventually forced the Pakistan government not to impose Urdu as the national language on the region that later became Bangladesh. The day of the protest, February 21, is now celebrated worldwide as International Mother Language Day (IMLD).

IMLD this year assumes greater significance as 2019 is celebrated as the International Year of Indigenous Languages – a campaign to promote widespread use of indigenous languages – by UNESCO and many of its partners worldwide.

The real question is what this means for Indian languages in general, and for the country's indigenous languages in particular.

A language is not just a means of communication but also a political device that gives individuals their identity. A language that is not used for native-language-based education and governance stagnates, and that stagnation leads to a slow death. Sanskrit, for instance, despite being one of the oldest South Asian languages, is now limited to literary and religious studies because it lacks these uses.

In a country like India, where only 22 out of 750 languages are recognised by the Union as part of the Eighth Schedule of the Constitution, there is a dire need for much more action to further the use of native languages in all aspects of life. Ganesh N. Devy, the founder of the People’s Linguistic Survey of India (PLSI), a citizen science initiative to survey the use of Indian languages in natural settings, has warned that almost half of India’s languages might die within 50 years.

As the internet rapidly shrinks the distance between people and places, both languages and writing systems are at great risk. Tim Brookes, founder of the Atlas of Endangered Alphabets (a project to preserve writing systems that may soon disappear), hints at this very issue: “Everyone had a screen or wanted a screen, and the English language and the Latin alphabet (or one of the half-dozen other major writing systems) were on every screen and every keyboard.”

This puts at a great disadvantage those who could only read and write, say, Mandombe, Wancho, or Hanifi Rohingya, he says. “The issues with the disappearance of languages and writing systems certainly invoke many to seek what can be done to protect languages from any potential danger and help it grow,” Brookes says.

A 2017 Google-KPMG study titled “Indian languages: Defining India’s Internet” highlights that the Indian-language user base online will be 2.5 times that of English users. That means the focus of the key internet stakeholders in the country will shift dramatically from English to Indian languages. With about 478 million mobile internet users (197 million of them from rural India), who constitute 59% of the total internet user population, India is slowly becoming the largest internet democracy.

For a fast-moving, app-based m-commerce market like India's, the way forward for companies, and for anything online that involves people, will have to be “Indian-language first”. Many companies have already started adopting multilingual interfaces, and many more are set to follow. This transition is exciting because enabling multilingualism on any online platform needs the input and participation of native-language speakers, which translates into more job opportunities.

Before the content and interfaces of several apps started to become multilingual, languages were not a business commodity but just devices for literary works. For someone like me, who has worked in the Indian-language internet sphere for over a decade, this change is very inspiring.

This leads to the next question: what kind of resources do native-language speakers have to build to ensure that their language is internet-friendly, so that it is not left out of any kind of internet-isation rollout?

Generally speaking, there are three things that, in my opinion, are a must for any language and its writing system.

Unicode standard

Unicode is an international standard that ensures consistent encoding for a writing system (script). As it is a universal industry standard, all major industry players, from operating system and software makers like Canonical (Ubuntu), Microsoft and Apple to mobile manufacturers, app developers and even conventional print media, strive to use Unicode in exactly the same way.
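
To make "consistent encoding" concrete, here is a minimal Python sketch that lists the code point and official name Unicode assigns to each character of a word. The helper name `describe_code_points` and the sample word are illustrative choices, not something from the standard. Note that the standard's character names use the older spelling "Oriya" for Odia.

```python
import unicodedata


def describe_code_points(text):
    """Return a (code point, Unicode character name) pair for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The word "Odia" in Odia script, written as escapes so the code points are explicit.
ODIA_WORD = "\u0B13\u0B21\u0B3C\u0B3F\u0B06"

for code_point, name in describe_code_points(ODIA_WORD):
    print(code_point, name)
```

Any Unicode-compliant system should store the word as this same sequence of numbers, whatever font is used to draw it on screen.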

Type and font designers follow the Unicode standard for a writing system when designing fonts. If native speakers of a language do not ensure the wider use of Unicode at all levels while creating content, there is a chance that their language (and its writing system) will fall off the internet bandwagon.

My own language, Odia, which is one of the three oldest languages of the subcontinent, has still not been included in the Google India search page or Google Translate because of the wide use of non-standard fonts, namely Akruti and Aprant, and the low use of Unicode-compliant fonts.
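
A rough way to see why non-standard fonts hurt discoverability is to check which code points a piece of text actually contains. The sketch below assumes that legacy font encodings store text as Latin-range characters that only look like Odia when the matching font is installed. The function name `odia_share` and the `legacy_text` sample are made up for illustration and are not real Akruti or Aprant output.

```python
# The Unicode block for Odia (named "Oriya" in the standard): U+0B00 to U+0B7F.
ODIA_BLOCK = range(0x0B00, 0x0B80)


def odia_share(text):
    """Fraction of non-whitespace characters that fall inside the Odia block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    in_block = sum(1 for ch in chars if ord(ch) in ODIA_BLOCK)
    return in_block / len(chars)


# "Odia language" encoded with Unicode code points.
unicode_text = "\u0B13\u0B21\u0B3C\u0B3F\u0B06 \u0B2D\u0B3E\u0B37\u0B3E"
# Hypothetical stand-in for font-encoded text: plain Latin-range characters.
legacy_text = "Jwk@ bhxh"

print(odia_share(unicode_text))
print(odia_share(legacy_text))
```

A search engine or translation tool that sees only Latin-range characters has no way of knowing the page is in Odia, which is one plausible reason such content goes uncounted.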

While the major languages of India have many good-quality Unicode-compliant fonts, the indigenous languages have a long way to go. Good-quality fonts matter the most, as user experience plays a vital role in audience engagement. A badly designed font does not keep the audience hooked to any user-engagement platform (say, a website or an app).

Corpus, dictionary, and a Wikipedia of its own

Developing many linguistic tools has prerequisites. Building a spelling checker requires a long list of all the words in the language. Machine translation tools require a dictionary containing translations of words, phrases and simple sentences. Speech synthesis requires audio recordings of words; this text-to-speech technology has wide applications, including tools like Google Assistant and automated voice-based services like Interactive Voice Response (IVR), which are used across industries to automate customer support.
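
As a sketch of the first prerequisite, the snippet below builds a word list from a corpus and uses it for the simplest possible spelling check: flagging words that are not on the list. It assumes a word is any run of characters from the Odia Unicode block, which is a simplification (digits in that block would count as words too). The function names and the tiny corpus are illustrative only; a real word list would come from a large body of published text.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of characters from the Odia block (U+0B00 to U+0B7F).
ODIA_WORD_RE = re.compile(r"[\u0B00-\u0B7F]+")


def build_word_list(corpus):
    """Count how often each word occurs in the corpus."""
    return Counter(ODIA_WORD_RE.findall(corpus))


def unknown_words(text, word_list):
    """Return the words in text that never appear in the word list."""
    return [word for word in ODIA_WORD_RE.findall(text) if word not in word_list]


BHASA = "\u0B2D\u0B3E\u0B37\u0B3E"       # "language"
ODIA = "\u0B13\u0B21\u0B3C\u0B3F\u0B06"  # "Odia"

# A toy three-word corpus standing in for real published text.
corpus = f"{ODIA} {BHASA}, {BHASA}."
word_list = build_word_list(corpus)

# The same word with its final vowel sign dropped, standing in for a typo.
typo = "\u0B2D\u0B3E\u0B37"
print(unknown_words(f"{ODIA} {typo}", word_list))
```

The larger and more varied the corpus, the fewer correct words get flagged by mistake, which is why regular publishing in the language matters for tools like this.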

Similarly, regular online publication contributes to the wider use of the language (and its writing system) online, so the language gets priority from bigger internet players like Google when it comes to machine translation tools like Google Translate.

Wikipedia is an encyclopedia with a collaborative editing model, and it has information about a wide range of topics contributed by volunteer editors (yes, anyone can create and edit Wikipedia articles). That makes it a great place for anyone to grow meaningful content in their own language online.

Because the content is openly licensed (particularly under Creative Commons licenses), Wikipedia’s text can be distributed easily without copyright infringement. Because Wikipedia is a really popular website, its content is often found on the first page of any web search (such as on Google). There are 292 active Wikipedias in 292 world languages. Anyone whose language does not yet have a Wikipedia can contribute to the Incubator project, or request a new one if nobody has started one already. The Incubator project is a gateway to creating a new Wikipedia in a language that has its own writing system, provided that writing system already has a Unicode standard.

Wikipedia is an essential encyclopedia to preserve languages. Credit: Screengrab/wikipedia.org

Ensuring elementary education and governance in native language

The Global Partnership for Education, the British Council and many others involved in children’s education strongly recommend native-language-based elementary education, as children learn faster in their “mother tongues”.

Knowledge shared in a native language evokes a sense of community and identity. In all kinds of governance structures, language plays a key role in engaging with the major stakeholders: government and leadership, the judiciary and law enforcement, and common people.

The process of uniting people via a language has hardly ever changed despite the widespread use of major languages like English or French or Mandarin and the fast shift to a digital era.

It is here that an ocean of opportunity lies for native speakers of all languages, but especially the ones that are underrepresented.

Subhashish Panigrahi is a multilingual internet advocate and has worked as a community manager across global nonprofits like Internet Society, Mozilla and Wikimedia Foundation and Indian research organisations like the Centre for Internet and Society. He founded OpenSpeaks to grow openly-licensed resources for marginalised languages and co-founded O Foundation (OFDN).