Losing our language
Artificial intelligence preserves endangered languages.
Language as we know it originated approximately 100,000 years ago as our species began to flourish. What began as gestures and grunts evolved into what you are reading today. Many of the world’s nearly 7,000 languages are categorized into one of the six major language families spread across the globe. However, nearly 40% of those languages are endangered.
An endangered language is one that is likely to die out with its speakers in the near future. Indigenous communities typically fall victim to this fate due to their minority status in countries where more popular languages are spoken. Despite the thousands of “living” languages used today, over 88% of the world’s population speaks one of only 23 languages. On the internet, the contrast is even starker, with English and Chinese constituting nearly half of all online content.
While technology is often blamed for limiting linguistic diversity online, artificial intelligence is on a mission to preserve disappearing languages. Exciting projects led by tech giants and inspiring individuals alike are changing the face of what experts are calling a “global linguistic catastrophe.” Although technology cannot fully capture the experience of human language, these efforts may save some languages from extinction.
Changing times (and tongues)
Spoken language is a fundamental communication tool that has kept cultures alive for millennia. Indigenous communities often preserve customs by telling stories in their native tongue. However, when the number of speakers dwindles, certain languages become endangered.
Take 92-year-old Cristina Calderon, a member of the Yagán community who realized she was one of the last people on the planet to speak her native Yamana language. Her fear that the language will die with her could become a dark reality. Stories like hers are becoming increasingly common.
But how does the language of a distant community affect you? Some may think that this is simply evolution taking its course. Yet many indigenous or ancient languages are the foundation of today’s communication: Ancient Greek and Latin, for example, evolved into modern European languages. When a language is lost, so are a people’s culture, customs, and social identity. History is erased, knowledge is forgotten, and traditions are shattered. We all benefit from diversity of thought, and language is no exception.
Thanks to modern technology, there are ways to halt extinction, even as communities shrink. Through audio recordings, workshops, publications, and now artificial intelligence, preservation is taking a new form.
Learning language artificially
Anyone who knows a second language understands the dedication needed to become fluent. With machine learning, this intensive process can be carried out much more quickly and efficiently. Several projects are underway to turn back the clock for disappearing languages.
- OBTranslate is a community of language translators using a deep learning and neural machine translation tool to bridge the communication gap. By building an open-source multilingual data repository of over 2,000 African languages, mainly local dialects, the community is documenting these languages like never before. Professional subtitling services and multilingual messaging assistance are other use cases it is exploring to circulate African languages across the globe.
- Chatbots like Reobot help learners of te reo Māori, the indigenous language of New Zealand, converse with one another. This Facebook bot responds to users in both te reo Māori and English. Jason Lovell, the bot’s creator, wants to add pronunciation features to take learning to the next level. IBM’s AI, which powers Reobot, may soon use a novel technique that allows AI to understand different languages while only being trained on one.
- Robots like Opie aim to achieve a similar goal by teaching indigenous Australian languages to children in remote parts of the country. The researchers behind the project partnered with TensorFlow, Google’s open-source AI platform, to transcribe tens of thousands of hours’ worth of language recordings.
- FirstVoices, powered by the Nuxeo open-source AI platform, creates keyboard apps and learning materials for aspiring students. Drawing on British Columbia’s rich indigenous history, the project is helping to raise a new generation of speakers.

Misunderstanding the minority
If what we see online is available in only a few languages, then the Internet’s promise of “information for all” falls flat. Digital accessibility is often designed to accommodate physical limitations, such as visual impairments, or to meet usability standards like legibility and navigation. However, accessibility also entails the free movement of information.
Beyond preserving endangered languages, AI also helps monitor how minority languages are used, sometimes harmfully, online. Largely English-speaking human moderators at large companies like Facebook often do not have the multilingual skills necessary to recognize offensive content in other languages.
Training AI with minority language data increases the speed and accuracy of flagging dangerous content that could otherwise be passed over as just another post. The scalability of this approach spares humans not only the time needed to learn new languages, but also the misfortune of having to confront negativity on a daily basis.
An evolving effort
Artificial intelligence is propelling language preservation in the right direction. As technology evolves, so do language and the way we communicate. The rise of open-source information is a blessing to all who wish not only to learn new languages, but also to protect their cultures and stories.
If you would like to save a piece of history, please support these organizations: